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VARIANTS OF THE SECANT METHOD FOR SOLVING 
NONLINEAR SYSTEMS OF EQUATIONS 

Clarence Cantor 
ABSTRACT 

Some variants of the Secant Method are developed for solving f (x) = o, 
n nonlinear equations in n unknowns. The new methods, consisting of 
Algorithms I and H, depart from existing versions of the Secant Method 
whenever certain conditions arise that would tend to cause poor con- 
vergence of the latter. These conditions are ascertained via simple 
test criteria associated with the new algorithms. When these criteria 
are satisfied initially at each step, the algorithms follow the same steps 
as existing versions of the Secant Method and, under certain assump- 
tions, are shown to possess superlinear convergence. Whenever these 
test criteria are not satisfied initially, the algorithms follow logical 
alternate procedures that provide a basis for linear convergence. The 
results of numerical experiments with a series of randomly generated 
problems support the claim of improved convergence for the new 
methods. Of the two new algorithms, Algorithm II is judged to be the 
superior and warrants further numerical investigation. 
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VARIANTS OF THE SECANT METHOD FOR SOLVING 
NONLINEAR SYSTEMS OF EQUATIONS 

INTRODUCTION 

The problem of solving f (x) = 0, n nonlinear equations in n unknowns, his 
many applications. For example, the equilibrium points of the n-th order 
dynamical system represented by the vector equation x = f (x) are simply the 
solutions 5c to f (x) = o, In a discrete system given by the vector equation 
x k + 1 = g (x k ), the equilibrium points are obtained by solving x = g (x) which is 
equivalent to solving f(x)= g(x) - x = 0. The problem can also arise as a 
result of the requirement to minimize (or maximize) some functional; equating 
the gradient to zero results in f (x) = 0 , n equations in n unknowns. Except 
for very specialized sets of nonlinear equations, all methods for solving f (x) = C 
depend on iterative techniques, which are readily implemented by computers. 

Some well known methods for solving f (x) = 0 are summarized in Section I 
with the emphasis on existing versions of the Secant Method and their limitations. 
The Secant Method has the attractive feature of requiring only one evaluation of 
the vector f (x) per iteration (as opposed to n + 1 function evaluations in a dis- 
crete Newton's Method for example). However it suffers from poor convergence 
at times and this characteristic becomes more prevalent as the order of the 
system increases. The goal behind the development of the Secant Method vari- 
ants described in Section II is to alleviate the conditions tending to cause poor 
convergence in present versions of the Secant Method. The results of theoretical 
considerations (Section HI) and numerical experiments (Section IV) indicate that 
this goal has been largely realized. 
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SECTION I 


BACKGROUND 


Newton’s Method in n dimensions (see e.g., [11] , [16] , [17] , and [23]) is 
probably the most widely known method for solving the n -th order vector equa- 
tion f (x) = o. Analogous to the case of one dimension, Newton’s Method in n 


dimensions is given by 




K, 


1 . y ? 


x k + 1 = x k “ J” l (x k ) f( xk ) 


( 1 . 1 ) 


(assuming J _1 (x k ) exists) 


where 

f (x k ) is the value of f (x) at x = x k 

J (x k ) is the Jacobian matrix of f (x) evaluated at x k 

x k is the present (k -th) estimate of the n -dimensional solution 

I 

and 


X k + 1 is the next estimate of the solution. 

Newton’s Method is a direct result of linearizing the system f (x) = 0 about the 
point x = x k . A Taylor’s expansion of f (x) about x = x k , in which all terms 
above the first order are dropped, yields (1.1). 


Assuming f (x) is continuously differentiable in a neighborhood of a solution x 
and that the Jacobian matrix is nonsingular in this neighborhood, Newton's 
Method will converge to x if the starting estimate x 1 is "sufficiently close" to 
x . In practice, Newton's Method suffers from three principal disadvantages: 


1. The region of convergence in some problems may be very small. Thus the 
method may fail to converge for all but fairly accurate initial estimates x 1 . 
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2. The Jacobian J (x k ) requires the evaluation of n 2 partial derivatives at each 
iterate x k . This can involve a considerable amount of computations even when 
the partial derivatives are approximated by finite differences, since n + l 
evaluations of f (x) per iteration would then be required. 

3. The inverse J" 1 (x k ) must be calculated at each step, again involving an 
undue amount of computations. 

These difficulties are more or less overcome in the class of quasi-Newton 
Methods represented by 

x k+i = x k _ a k H k f k (1.2) 

where 

H k is some approximation to J~ 1 (x 1 ' ) 
f k = f (x k ) 

and 

a k is some real number greater than zero. 

If H k = J~ 1 (x k ) and a k is chosen to minimize the norm <P k + 1 , or to reduce ® k + 1 
below <P k , where 


n 



i= 1 


(f k = i-th component of vector f k ) 
then one has a reduced-step Newton's Method, namely 

x k+l = x k - a k J -1 (x k ) f k . (1.4) 

At each step in (1.4) a value of a k greater than zero can always be found which 
will reduce <P k + 1 below <p k , provided that J (x k ) is non-singular and provided 
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that f (x k ) £ o. This is easily proved by showing that the direction of Ax k given 
by (1.4), namely - J" *(x k ) f k , has a component along the negative gradient of <P k , 
i.e., by showing that 

(gradcp k , J _1 (x k ) f k ) > 0 . (1.5) 

One has 


3 cp k 

3 x. k 

1 



3 f k 



f k 


or 


grad cp k - 2 J T (x k ) f k . 


( 1 . 6 ) 


Then 


( grad cp k , J“ 1 (x k ) f k ) = 2(f k ) T J (x k ) J 1 (x k ) f k 

= 2(f k ) T f k > 0, f k * 0. (1.7) 


This reduced-step Newton's Method, with a k selected to reduce <P k + 1 below <P k , 
will converge to a solution of f (x) = 0 for any starting value in a region 
bounded by the contour cp = constant, if a solution exists in this region and if 
J -1 (x) exists everywhere in this region. This region of convergence is as large 
or larger than that in the standard (full-step) Newton’s Method so that the dis- 
advantage of a small region of convergence often encountered in the full-step 
Newton's Method is somewhat alleviated by the reduced-step variant. However, 
the reduced-step version still has the disadvantages of requiring the calculation 
of J (x k ) and its inverse at each step. 
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The Method of Steepest Descent, or Gradient Method (see e.g., [11] , [16] , [17] , 
[19]), for solving f (x) = 0 can be regarded as a quasi-Newton method by 
letting H k = J T (x k ), with a k selected to minimize or reduce the norm q> k+1 . 

Thus one has 

x k + l = x k - a k jT( x k) fk. (1.8) 

Note that grad <p k = 2 J T (x k ) f k so that the direction of Ax k given by (1.8) is 
along the path of steepest descent. Thus a value of a k > 0 can always be found 
to reduce cp k+1 below cp k as long as J T (x k ) f k ^ 0. Assuming J (x k ) is non- 
singular, f k f 0 implies J T (x k ) f k f 0. In addition, we can sometimes satisfy 
J T (x k ) f k ^ o even though J (x k ) is singular. Thus the region of convergence in 
the Gradient Method is as large or larger than that in the reduced-step or regular 
(full-step) Newton’s Method. In addition, this method eliminates the need to 
calculate the inverse of J (x k ). However, the Gradient Method lacks the quadratic 
convergence of Newton's Method. Its behavior near a solution of f (x) = o is 
rather oscillatory in general. 

A method which combines the feature of the larger region of convergence inherent 
in the Gradient Method together with the quadratic convergence of Newton's 
Method near a solution, is Marquardt’s Algorithm [ 14] , 

x k + l = x k _ [j T (x k ) J(x k ) + \ k I] 1 J T (x k ) f k . (1.9) 

As k k - 0, (1.9) becomes Newton's Method, while \ k - 00 yields the Gradient 
Method. In general, a small value of is sought in order for the method to 
approach the quadratic convergence of Newton's Method, and is increased 
only as necessary to satisfy 
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q,k + 1 < q,k ( 


(1-10) 


which is the basic criterion that must be satisfied at each step. While the con- 
vergent properties of Marquardt's algorithm are improvements over those in 
Newton's Method alone or the Gradient Method alone, the algorithm requires 
considerably more computations at each step than either of the latter two 
methods. 

If H k = J _1 (x 1 ) and a k = l, one obtains the simplified Newton's Method, 

x k + i = x k _ j~ 1 (x 1 ) f k . (1.11) 

This simplified Newton's Method eliminates the need for frequent calculation 
of J (x k ) and its inverse. In general, this method does not converge as rapidly 
as Newton's Method. In particular, it lacks the quadratic convergence of Newton's 
Method, unless by some odd chance, J (x) = J (x 1 ) where x is the solution. In 
addition it suffers, in common with Newton's Method, the fault of a small region 
of convergence in many problems. This fault can be alleviated somewhat, as 
was done for Newton's Method, by a reduced-step version, namely, 

x k+1 = x k - a k J _1 (x 1 ) f k (1*12) 

where a> is chosen to minimize <P k+1 or to satisfy the condition 

<P k + 1 < <P k • (1.13) 

< 

The Secant Method [2] , [16J can be expressed as a quasi-Newton method in 
which a k = i and H k = (J k ) _1 is the inverse Jacobian of the equivalent linear 
system corresponding to the last n + 1 points. It is a generalization of the 
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Secant Method in one dimension, in which a straight line is passed through the 
two points (x k , f k ) and (x k_ 1 , f k-1 ); the intersection of this line with the x axis 
determines x k+1 . Thus in n dimensions, one has 


x k + l = x k - (Jk)-i f k . 

The matrix J k (or (J k )“ 1 ) is determined from the requirement that 
f* - J k [x* - x*] , i - k, k - 1, ...,k - n 

where x* = x k+1 is the next estimate of the solution. Letting 


and 


equation (1.15) yields 


A: 


= x i + 1 


X A 


Afi ^ fi + i 


f * 


Afi = J k Ax* , i = k - 1, k - 2 k - n. 

With the n x n matrices A X k and A F k defined as 


and 


AX'* = [A x k_n • • • Ax k " 2 Ax k_1 ] 


A F k = [A f k " n ... A f k " 2 A f k ~ J ] 


equation (1.17) yields 


(1.14) 


(1.15) 


(1.16) 


(1.17) 


(1.18) 


A F k = J k AX'* 


(1.19) 


or 
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jk = AF k (AX' 1 ) -1 


(assuming (AX k ) _1 exists) 

and 

H k 4 (jk)-i = A X k (A F k ) -1 (1.20) 

(assuming (A F k )“ 1 exists). 

Thus, solving equation (1.20) for (J k ) _1 and substituting in (1.14) yields x k + 1 , 
the next estimate of the solution to f (x) = o, after, which the process is repeated. 
The Secant Method is defined even if AX k is singular, since the existence of H k 
requires only that (Af 14 )" 1 exists. However, if AX k is singular (implying H k is 
singular), then it can be shown that the iteration cannot converge to a solution 
except under specialized conditions. Note that J k in the Secant Method is not 

the same as J (x k ) in Newton’s Method. The latter represents the Jacobian at 

3 f i 

the point x k (formed from n 2 partial derivatives ^ evaluated at x = x k ) while 

x j 

J k is the Jacobian of the linear system (1.15) that interpolates the n + 1 points 
(x k , f k ), (x k_1 , f k_1 ) (X k “ n , f k-n ). 

An equivalent representation of the Secant Method is obtained by defining 

8 X* = X k - X 1 , S fi = f k - fi (1.21) 

i = k - 1, k ~ 2, . . . , k - n 
and n x n matrices 8 X k and 8F k as 


S X k = [s x k-n ... 8 x k “ 2 8 x k_1 ] 


( 1 . 22 ) 


and 
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S F* = [s f k_n . . . S f k “ 2 S f k_l ] . 


Then equation (1.15) yields 


or 


and 


8 f 1 = J k S x*. i = k - 1 k - n 

S I* = J k 8 Xk 

J k = SFkCSXk) -1 
(assuming (8 X k ) _1 exists) 

(j k )-i = § x k (8 Fk) -1 

(assuming (8F k ) _1 exists). 


(1.23) 

(1.24) 


(1.25) 


The piatrix J k and its inverse, defined by (1.24) and (1.25), are the same as the 
corresponding ones obtained from (1.19) and (1.20), since both representations 
are derived from the same linear system (1.15); assuming the existence of the 
required inverses, n + 1 points (x 1 , f *) uniquely define the parameters of an 
n-th order linear system. It is also easy to show that (SX k ) -1 exists if and only 
if (AX k ) -1 exists and that (SF k ) -1 exists if and only if (AF k ) _1 exists. Thus 
equations (1.25) and (1.14) form an equivalent representation of the Secant 
Method. 

The Secant Method avoids the necessity of calculating J (x k ) (and its n 2 partial 
derivatives) at each step but still requires (in the quasi-Newton form of the 
Secant Method) a matrix inversion and multiplication at each step. The Secant 
Method has local super linear convergence provided the last n vectors Ax' 


1-8 



forming AX k remain sufficiently independent (see Section III). But there is no 
way of insuring that this condition is met at every step, so that convergence, let 
alone superlinear convergence, is not assured in general. Even when local con- 
vergence occurs, the region of convergence may be rather limited in many 
problems as is the case for the full-step Newton’s Method. Again, this region 
can be enlarged in the reduced-step version, namely 

x k + i = x k _ a kQk)-i fk (1.26) 

with (J k ) -1 determined as before from (1.20) or (1.25) and a k selected to satisfy 
(1.13). 

Probably the greatest drawback of the Secant Method is the numerical instability 
and lack of convergence resulting from near singular matrices AX k and AF k . 
This occurs when Ax k_1 (Af k_1 ) is "nearly" a linear combination of the previous 
n - 1 vectors. The matrix J k determined from such near singular matrices 
fails to approach J (x k ) at the solution, thus ruining the theoretical convergence 
of the Secant Method. 

It is not necessary to calculate J k or its inverse explicitly in the Secant Method 
in order to solve for x* = x k+1 , the next estimate of the solution. Following 
Wolfe's derivation [25] , one can denote the last set of n + 1 points by (x 1 , f 1 ), 
(x 2 , f 2 ) (x n + 1 , f n + 1 ). Then equation (1.15) would read 

f 1 = J(xi - x*), i = 1, 2 .... n + 1 (1.27) 

where J is the matrix of the linear system corresponding to these n + 1 points 
(x 1 , f 1 ). Since the n + l vectors f 1 must be linearly dependent, there exist con- 
stants tt . not all zero such that 
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*> = ° 


(1.28) 


One can normalize .the constants n. such that 


£ n i = i 


Utilizing equation (1.27) in (1.28), and multiplying by J" 1 which is assumed to 


exist, one obtains 


£ rr i (x 1 " x*) - 0. 


Solving for x* (noting that l TT i =1) yields 


£ n i xi 


The n + 1 constants rr . can be obtained from equations (1.28) and (1.29). With 
the (n + i) x (n + 1) matrix A defined as 


f 1 f 2 fn+1 


11 ... 1 


and the (n + l)-vector rr as 


A r ,T 

= [TTj TT 2 . . . 7T n+1 ] 


equations (1.28) and (1.29) can be written as 

At t = [0 0 . . . 0 l] 


(1.34) 
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or 

7T = A' 1 [0 0...0l] T (1.35) 

Equations (1.35) and (1.31) yield x* , the next estimate of the solution. Wolfe 
suggests using the pair (x*, f (x*)) to replace the pair (x j, f > ) for which the 
norm <P‘ = (f i ) T f 1 is a maximum. Denoting the new matrix of (1.32) as A* (in 
which f (x*) has replaced ), he uses the Sherman- Morrison modification 
method [ 12] , [20] to obtain a simplified formula for calculating (A*)" 1 based on 
knowledge of the previous inverse, A -1 . 

This version of the Secant Method has several drawbacks. For one, the matrix J 
is not explicitly calculated, and in some problems it is necessary or desirable 
to obtain an approximation to the Jacobian of the system at the solution. Sec- 
ondly, it is inherently a full-step Secant Method; it does not lend itself to the 
introduction of a constant o- k different from unity. Thus the norm reducing 
feature of the reduced-step method, which can enlarge the region of convergence, 
is not available in this version. Finally, the matrix A often becomes ill-conditioned 
as we approach a solution, resulting in the same degraded convergence or lack of 
convergence near the solution that afflicts other versions of the Secant Method. 

This problem can be alleviated in the proposed variants of the Secant Method, 
where (J k ) _1 is available at each step. This will be discussed later. 

Barnes' version of the Secant Method [ 1] starts with an initial estimate J 1 of 
the Jacobian, and develops corrections to the estimated Jacobian J at each step 
as 

jk + i = jk + D k (1.36) 
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where 


^ = f * 1 (»>) T 

( z k ) T A x k 

and A x k is obtained in the usual way (e.g., see (1.14) ) from 

J k A x k = - f k . (1.37) 

If k 1 n , the vector z k at each step is selected to be orthogonal to the previous 
(k - l) vectors Ax 1 , Ax 2 , . . ..Ax* 1 " 1 . If k > n, z k is selected to be orthogonal to 
the previous n - 1 vectors Ax k-n+1 , . . ., Ax k_1 . The result of this choice, using 
(1.36), is that 

D 14 A x* = 0, 0 < k - i < n (1.38) 

and 

J k + 1 Ax* = J k Ax 1 , 0 < k - i < n . (1.39) 

Since (1.39) is true for any k such that 0 < k - i < n , this implies that 

J k + 1 Ax* = Jk A X* = Jk - 1 Ax* = ... = J i + 1 Ax 1 . (1.40) 

Utilizing (1.36) once more, yields 

Ji + x Ax 1 = Ji Ax 1 + f i + 1 (1.41) 

or 

J i + 1 Ax* = A f‘ 

using equation (1.37). Hence, from (1.40) and (1.41), 

J k A x‘ = Afi, 0 < k - i < ri . (1.42) 
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I 


If k > n , one has 


jk Ax 1 = Afi, i = k - 1, k - 2 k-n 


(1.43) 


This is exactly the same as equation (1.17) which forms the basis for the Secant 
Method. Thus, in n iterations, starting from initial estimates J l and x 1 , 

Barnes' algorithm generates n + 1 points (x 1 , f *), (x 2 , f 2 ), . . . , (x n + 1 , f n + 1 ) and 
a matrix J n + 1 which is the Jacobian of the linear system interpolating these 
n + 1 points. Every subsequent iteration drops the earliest point and adds the 
most recent one, modifying the J matrix to correspond to this most recent set 
of n + 1 points. In other words, after n iterations Barnes' algorithm is exactly 
the same as the Secant Method as previously derived. 

Barnes' algorithm has the advantage of not requiring an initial set of n + 1 points 
in order to begin the Secant Method. It can begin with only initial estimates J 1 
and x 1 and form as a matter of course the n + 1 points leading to the regular 
Secant Method. Of course, the better the initial estimate J 1 , the faster will be 
the convergence in general. For a linear system the algorithm will theoretically 
converge to a solution of f (x) = o within n + 1 iterations, regardless of the 
initial estimate J 1 , since the matrix J n + 1 corresponding to the n + 1 points 

(x 1 , f 1 ), (x n+1 , f n + 1 ) completely defines the linear system; the next 

(n + 1) application of (1.37) (with k = n + 1) yields the solution of the linear 
system f (x) = o. Barnes' algorithm has the disadvantage of requiring the solu- 
tion of n linear equations at each step (in solving (1.37) for Ax k ), which is almost 
equivalent to requiring the inversion of J k at each step. It also, in its original 
form, represents a full-step Secant Method (a k = l) so that its region of con- 
vergence is restricted compared to the reduced-step version. 
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Rosen's modification [18] of Barnes' algorithm eliminates these last two disad- 
vantages. In addition to restructuring the algorithm as a reduced-step method, 
Rosen formulates the algorithm to update (J k ) _1 directly rather than J k . This 
is accomplished as follows. Let 

Ax k = " a k (J k ) _1 f k (1.44) 


corresponding to a reduced-step method, and let the correction to J k be repre- 
sented as 

jk + i = jk + D k (i . 45) 

where 

^ A (A f k - J k A x k ) (z k ) T 
(z k ) T A x k 


With z k selected as in Barnes' algorithm, this yields the reduced-step Secant 
Method after n iterations, in the same way that Barnes' original algorithm 
produced the full-step version. However, instead of updating J k as in (1.45) , 
Rosen uses the Sherman-Morrison modification formula [12] , [20] to convert 
(1.45) into an updating of H 11 = (J k ) _1 . Thus, one obtains 


H k + 1 


Hk + (x k - H k f k ) (z k ) T H k 
(z k ) T H k Af k 


(1.46) 


where 


H k 


A 


(J k )* 1 - 


Thus H k + 1 can be calculated fairly simply from a knowledge of H k . Then A x k 
can be calculated directly from (1.44) without the need for solving a set of n 
linear equations or of completely recalculating the inverse of J k at each step. 
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The Rosen-Barnes algorithms do not solve the problem of numerical instability 
inherent in the Secant Method whenever the matrix AX k + 1 (or AF k+ 1 ) becomes 
ill-conditioned as a result of Ax k (or A f k ) being "nearly" a linear combination 
of the previous n - 1 vectors. In the Barnes algorithm, the first condition mani- 
fests itself whenever 


l(z k ) T Ax k 1 < ^ 

l|z k H 2 ||Ax k II 2 


(1.47) 


where e is some small positive number. This condition would raise doubts as 
to the validity of the results obtained in calculating- J k+1 from equation (1.47). 
In Rosen's modification, the second condition would be evident whenever 


l(z k ) T H k A f k l < 
|| (H k ) T z k || 2 || A f k || 2 


(1.48) 


I where e' is some small positive number. Again, this condition would detract 
i from the reliable computation of H k + 1 from (1.46). Barnes considers the problem 

f 

i and suggests rejecting the vector Ax k whenever the condition of equation (1.47) 

, occurs. He claims that a satisfactory alternative is to select a Ax k for this step 
: so as to result in z k being parallel to Ax k . Of course this cures the numerical 
problem associated with the reliable computation of J k + 1 . However, it does little 
to insure fast convergence to a solution of f (x) = o since the direction of the 
selected Ax k would be quite different from the presumably optimum direction 
suggested by the previous value of J k . 


Broyden's algorithms [4] form another quasi-Newton method, which can be 
related to the Rosen-Barnes version of the Secant Method. Broyden's Method 1 
algorithm (which he found to be effective) is 



x k * a k H k f k 


(1.49) 


x 


k + 1 = 


and 


H k + i = H u + (Ax k - Hk A (A x k ) 7 H k 

(Ax k ) 7 H k Af k 


(1.50) 


where H k is an approximation to J -1 (x k ). 

Note that this can be obtained from the Rosen-Barnes algorithm of equation (1.46) 
by letting z k = Ax k . Recall that in the latter method, z k is selected to be 
orthogonal to the previous i vectors Ax k_i , . . ., Ax k " 2 , Ax k-1 , where i < n - 1. 
Gram-Schmidt orthogonalization is used to accomplish this. Thus, starting with 
k = 1, one would form: 


z 1 = Ax 1 


/\ 



f 'v 


Y' 



z 2 


Ax 2 


(z 1 ) 7 Ax 2 t 
(z 1 ) 7 z 1 


(1.51) 


z 


k 


A x k 


- E 


i * 1 


(z 1 ) 7 A x k 
(zi) T Z i 


k < n . 


(For k > n , the above expressions are modified to form z k orthogonal to the 
previous (n - 1) vectors Ax 1 .) 

Hence, when only one vector (Ax 1 ) is available, the selection of z 1 = Ax 1 in the 
Rosen-Barnes algorithm is the same as would be made in Broyden’s Method 1 
algorithm. However, as additional vectors Ax 2 , Ax 3 , etc., become available, z k 

\ 
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in the Rosen-Barnes algorithm are formed from (1.51), whereas Broyden's 
algorithm continues to use z k = Ax k . The choice of z k as per equation (1.51) 
results in the Rosen-Barnes algorithm evolving into the Secant Method after n 
iterations, so that for a linear system the matrix H n + 1 equals J -1 , the inverse 
Jacobian of the system, and the next (n + 1) application of equation (1.44) results 
in the solution of the linear system f (x) = o. There is no such assurance in 
Broyden's algorithm that the method will converge in n + l iterations for a linear 
system, or even converge at all, although subsequent work by Broyden [6] estab- 
lishes conditions under which his method converges for linear systems. Numeri- 
cal experience seems to indicate that Broyden's Method 1 algorithm is effective 
in a wide variety of problems, especially if the initial estimate H 1 of the inverse 
Jacobian is a good one, and/or the initial estimate x 1 is close to the solution. 

If this is not the case, the Rosen-Barnes algorithm is considered to be more 
effective. 


Rosen indicates that on a particular test problem, better results were obtained 
using a z k that was the average of that obtained from his algorithm with the z k 
obtained from Broyden's algorithm, than were obtained using either procedure 
alone. This author is inclined to think that one of the reasons for this is that 
Broyden's algorithm inherently avoids the numerical problems evidenced by the 
occurrence of condition (1.47). In fact, for Broyden's algorithm, 


I (z k ) T A x k | 
ii Z k II 2 II A X k II 2 


(1.52) 


since z k = Ax k . The average of the z k 's from the two methods will consistently 
avoid the condition of (1.47). At the same time, this average may still produce 
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a fair representation of the linear system corresponding to the previous 
points. 

Broyden’s Method 2 consists of equation (1.49) together with 


H w + i = H k + x k - H k A f k ) (Afk)T 

(A f k ) T A fk 


(1.53) 


where H k is again an approximation to the inverse Jacobian. Equation (1.53) 
can be obtained from (1.46) by letting 


(z k ) T H k = (Af k )T. 


(1.54) 


Equation (1.53) can also be derived from Zeleznik' s form of a generalized 
quasi-Newton Method [26] , namely equation (1.49) combined with 


H k + 1 


tI . A x k (u k ) T H k A f k (v k ) T 

H* + _ , 

(u k ) T A f k (v k ) T A f k 


(1.55) 


Letting u k = v k = Af k results in (1.53). Zeleznik shows that for a linear system, 
in order for (1.55) to converge to the inverse Jacobian within n iterations, the 
following conditions must be satisfied. 


(u k ) T A f 1 = 0, ( v k ) T A f* = 0. 


(1.56) 


0 < k ~ i < n 


But, as Zeleznik points out, the choice of u k = v k = Af k gives no assurance of 
meeting (1.56); hence there is no assurance that Broyden’s Method 2 will con- 
verge even for a linear system. However under certain conditions, namely 
when II I - AH k ° || 2 < l at some instant k 0 , it is easy to show that for a k e (0, 1] 
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Broyden's Method 2 converges to the solution x of the linear system f (x) 

= A(x - x) = 0. Still, numerical experience has indicated that Broyden's 
Method 1 is effective while Method 2 is not. This author is of the opinion that 
the ineffectiveness of Method 2 is due largely to its inability to insure against 
singularity of H k , a condition which would prevent convergence of the iteration 
except under specialized conditions. Broyden's Method 1 does prevent singu- 
larity of H k as shown by the following. 

Substituting Ax k from (1.49) in (1.50) yields 


H k + 1 


H k - 


H k (a k f k + A f k ) (A x k ) T H k 
(A x k ) T H k A f k 


or 


H k + 1 = H k (I - o-u v T ) 


(1.57) 


where 


A 1 

(Ax k ) T H k A f k 


u = a k f k + A f k , v = (H k ) T A x k . 


Assume that det H k / 0. Then det H k + 1 ^ o if and only if det (I - cruv T ) ^ o. 
But 


det (I - <ruv T ) - 1 — cry T u (see [12]) 

(A x k ) T H k (a k f k + A f k ) 
(A x k ) T H k A f k 


(A x k ) T (A x k - H k A f k ) 
(A x k ) T H k A f k 


(using (1.49)) 
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_ (A x k ) T A x k 

(A x k ) T H k A f k (1-58) 

(assuming (Ax k ) T H k Af k f o). 

Thus Broyden's Method 1 insures that H k + 1 is non-singular if H k is non-singular. 


It is also of interest to note that the Rosen-Barnes (Secant) Method will insure 
the non-singularity of H k + 1 if condition (1.47) is avoided. A development 
analogous to the preceding yields 


where 


and 


H k + 1 = H k (I - au v T ) 


(z k ) T H k A f k 
u = a k f k + A f k 

v = (H k ) T z k 


det (I “ cruv T ) 


( z k ) T A x k 
(z k ) T H k A f k 


(1.59) 


(1.60) 


(assuming (z k ) T H k Af k ^ o). 

Thus the non- singularity of H k implies the non-singularity of H k + 1 if (z k ) T Ax k ^ 0. 


Even though Broyden's Method 1 assures that H k+1 is non-singular, as does the 
Rosen-Barnes (Secant) Method with the additional assumption of (z k ) T Ax k ^ o, 
neither method insures that H k exists, since the denominator in each algorithm 
could approach zero. In Chapter II where some new variants of the Secant Method 
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are presented, it will be shown that the algorithm denoted as Algorithm II 
insures that H k both exists and is non-singular. 

There are many other methods for solving f (x ) = o including two-point secant 
methods (as opposed to the (n + l)-point secant method which is simply called 
the Secant Method in this report) and Steffensen's methods [ 16] , [24] . Both of 
these classes of methods exhibit super linear convergence (1.61 and 2 respec- 
tively). But, like the discrete Newton's Method, they generally require n + 1 
function evaluations and a matrix inversion per iteration. The system f (x) = o 
can also be solved by any of a number of minimization techniques [ 16] , since 
solution of f (x) = o is equivalent to minimizing cp (x) = f T (x) f (x) (to zero). 

A relatively simple minimization technique is the Fletcher-Powell-Davidon 
Method [ 7] . Finally, there are more complex methods, such as the Freudenstein- 
Roth Method [8] , for solving f (x) = 0 in the difficult case where the Jacobian is 
singular at points in any region connecting the starting estimate with the solution. 
The other methods mentioned are ineffective in this case. 

The remainder of this report will concern itself with some new variants of the 
Secant Method. These variants were developed to overcome the previously dis- 
cussed limitations of existing versions of the Secant Method. 
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SECTIONS II 


ALGORITHMS I AND II 

2.1 INTRODUCTION 

Algorithms I and II are variants of the Secant Method for solving f (x) = 0, n 
nonlinear equations in n unknowns. These algorithms are designed to avoid 
some numerical problems that can cause poor convergence in the Secant Method. 
The complex series of steps leading to their development will be omitted; rather 
the final form of the algorithms will be presented and analyzed. 

Both algorithms follow the quasi-Newton iteration 

x k + 1 - x k = A x k = - a k H k f k (2.1) 

f k = f(x k ) 

where a k is either selected as unity (full step version) or else so as to insure 
that 

qpk+i < <pk (2.2) 

where 

cp k = (f k ) T f k . 

The matrix H k , representing an estimate of the inverse Jacobian at x k , is formed 
to satisfy 

A x. k = H k A f . k , i = 1, 2, . . m k (2.3) 

< n. 

The vectors Ax k are certain previous difference vectors Ax> , j < k , formed 
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via (2.1). The vectors Af k are the corresponding difference vectors Af i formed 
via the system function f (x). 

The set of m k difference vectors Ax k are not necessarily those corresponding 
to the last m k + 1 iterates x > , as would be the case in existing versions of the 
Secant Method. Rather they are formed from those most recent iterates x* 
such that certain conditions are met. In Algorithm I, the governing condition is 
that the m k vectors A f k are linearily independent. This insures the existence 
of H k , In Algorithm II, the governing conditions are that the m k vectors Ax k are 
linearily independent as well as the corresponding set of vectors A f ^ . This 
insures that H k both exists and is non-singular. 

The formation of H k in each algorithm is described in the following sections. 


2.2 ALGORITHM I 


Algorithm I consists of the combination of Algorithm IB (when < n) followed 
by Algorithm IA (when m k = n). The matrix H k in Algorithm I is updated as 
per 


(A x k - H k Af k ) (b. k ) T 

H k + 1 = H k + 1 — 

(b k ) T A f k 


(2.4) 


provided that the vector b . k satisfies 

a I (b k ) T A f k | 

BY = > 

J II b k || || A f k || 


(2.5) 


where is a preselected constant in the interval (0, 1). Otherwise H k is re- 
tained unchanged (i.e., H k + 1 s H k ). The formation of b! 1 in Algorithm I is ac- 
complished as follows. 
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In Algorithm IA (used when m k = n), bJ* is formed from the j -th row of (AF k )" 1 
where AF k is the n x n matrix formed from n previous vectors A f k i.e., 


(b! 1 )* = j-th row of (AF k ) _1 (2.6) 

where 

AF k = [Af k Af k . . . Af k ] (2.7) 


The integer j is selected as the first integer in the sequence j = 1, 2, etc., such 
that criterion (2.5) is satisfied. The new matrix AF k+1 is formed from AF k by 
replacing A f k in the latter matrix with the current vector A f k and permuting 
the resultant matrix so as to place Af k in the last column, i.e., 


where 


and 


A F k + i = [AP< + (A f k - A f k ) (e.) T ] ) T 


(P.)T £ [e t . . . Vj e j + 1 . . . e n e.] n „ n 


(e.) T § [0 . . . 0 1 0 . . . 0] lxn . 


( 2 . 8 ) 


(2.9) 


(1 in i-th column) 


The matrix A F k + 1 is not explicitly needed in Algorithm IA but rather the matrix 
(AF k + 1 ) _1 . This matrix is readily calculated from 


(AF k + I )" 1 = P 


(AF k ) _1 + 


( ej - (A F k )“ 1 A f k ) (b^) 1 
(b. k ) T A f k 


( 2 . 10 ) 


where Pj , e j are as defined in (2.9). 

/yj C.J 

p. M - 2-3 

' j 



Equations (2.4), (2.6), and (2.10), in conjunction with (2.1), constitute Algorithm 
IA in the case where a suitable vector b k can be determined, i.e., one satisfying 
criterion (2.5). In this case, the vector pair (Ax k , Af k ) becomes one of the n 
pairs satisfying (2.3) at k + l. Satisfaction of criterion (2.5) has insured that 
A f k is "sufficiently" independent of the remaining (n - 1) vectors retained in 
AF k + 1 .. 

If a suitable vector bj' cannot be determined, then H k and (AF k ) -1 are retained 
unchanged (i.e., H k+1 s H k and (AF k+1 ) -1 » (AF k ) _1 ) and the iteration is con- 
tinued with (2.1). In this case, the vector pair (Ax k ,Af k ) does not become one 
of the pairs satisfying (2.3) at k +1. 

In order to be initiated at some step k 0 , Algorithm IA requires n previous 
vector pairs (Ax k °, Af k ° ), i = l to n. Assuming the vectors A f. k ° are inde- 
pendent, the corresponding matrix H k ° can be determined from (2.3). The use 
of Algorithm IA then insures that at every subsequent step k > k 0 , the vectors 
Af. k of equation (2.3) are independent. This insures that H k is well defined 
(although it may become singular). If criterion (2.5) is ignored (i.e., by letting 
j = 1 in (2.6) at every step), the resultant algorithm (denoted as IA-S) is essen- 
tially the same as the quasi-Newton version of the Secant Method described 
e.g., in [16] . In this case, there is no assurance that H k exists or is non- 
singular. 

When m k < n , Algorithm IB is utilized to form b k . This algorithm permits the 
iteration to start with only initial estimates x 1 of the solution and H 1 of the in- 
verse Jacobian. After n or more iterations, Algorithm IB will have formed n 
vector pairs (Ax k °, Af k °) in which the independence of the vectors A f k ° is 
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assured. Thus Algorithm IA can take over at k = k Q as previously discussed. 
Algorithm IB can be described as follows. 

At any step k, where < n , let matrices B j' and D k be represented as 


B i k 


A 


(b k ) T 


(b m k k ) T 


D k = 


- 


A ff 


. A f k” 

mk J n. 


( 2 . 11 ) 




n^xn 


where 


The vectors Afk, i = l to m k , represent m k independent vectors Af m , m < k, 
that were previously formed and retained in D k . Form the vector b k by some 
orthogonalization method (e.g., Gram-Schmidt orthogonalization) so as to be 
orthogonal to the m k vectors Af k in D k . if the vector b. k thus formed satisfies 
criterion (2.5), then form H k + 1 as per (2.4) and form B k + l , Df* 1 as 


B k + 1 


B k ,0 


(b k ) T 


(b k ) T A f k 


( m k + 1 ) * n 


( 2 . 12 ) 


where 
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B k A fk (b. k ) T 


B 1.0 


£ B k - 


(b.k) T 


A f k 


and 


Dk + 1 


[“} 



n« (m k + l) 


(2.13) 


It is easy to show that the matrices Bf* 1 and Dj E+1 thus formed satisfy 


where 


Rk + l D.k + i = I 

1 1 m k+l 

m k + 1 “ m k + * • 


(2.14) 


Moreover, satisfaction of criterion (2.5) has insured that Af k is "sufficiently" 
independent of the remaining m k vectors retained in D k + 1 . 

If the vector b. k as formed by orthogonalization fails to satisfy criterion (2.5), 
then (b k ) T is selected as the j -th row of B} 1 where j is the smallest integer be- 
ginning with 1 such that criterion (2.5) is satisfied. If such an integer exists 
(1 £ j 1 m k ), then the associated vector b. k is used in equation (2.4) to form 
H k+ 1 , while B k + 1 andD| c + 1 are calculated from 


B k+1 = P. 

1 J 


|B k 


(e j - B k A fk) (b.k) 1 
(b k ) T A fk 


(2.15) 


and 


D k + i = |- D k + (Afk - Afk) (e .) T ] (R) T (2.16) 
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where 


and 


(Pj) T S [ ei ... e,„ ... e„ k e,]^ 


(2.17) 


(e.)7 I [0... 0 1 0... 0] 


(1 in i-th column) 

It is easy to show that the matrices Bj c + 1 and D k+1 thus formed satisfy 


where 


B k+1 D k + i 


I 


m k+l 


m k+ 1 m k • 


(2.18) 


This alternate procedure for forming bj* implies that A f k is used to replace 
some earlier vector Af k in forming D k + 1 . At the same time, the independence 
of the m k vectors comprising Df* 1 is assured. 

If an acceptable vector b k cannot be determined by this alternate procedure, 
then the procedure is as follows. Let 

H k+1 = H k 

B i +1 = B i k (2.19) 

Dk + i = D k 

and continue the iteration with (2.1). 

If and when a point k = p is reached such that m k + 1 = n, then the matrix Bp + 1 
will satisfy 
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( 2 . 20 ) 


BJ >+ 1 = (DP +1 ) -1 s (AFP + 1 ) -1 

where AFp + 1 contains n independent vectors Af m . At k = p + 1 = k 0 , the 
iteration can be continued with Algorithm LA (which is essentially the Barrs as 
the alternate procedure just described for forming b^, with m k = n). 

To start Algorithm IB (at k = 1), simply treat Bj 1 as an empty row vector and D* 
as an empty column vector, i.e., nij = 0. Then let = A f 1 (as would be the 
case for example using Gram-Schmidt orthogonalization) , and this obviously 
satisfies criterion (2.5). 

It still remains to be shown that the matrix H k , as formed by Algorithm I (i.e., 
Algorithm IB followed by IA), satisfies equation (2.3). This will be shown next. 

Consider the formation of b k in Algorithm IB at each step k at which an accept- 
able b. k can be formed. If b. k is formed by the orthogonalization procedure, 
then it is orthogonal to the previous vectors Af k comprising Df . Expressed 
another way, b k is orthogonal to all vectors retained in Df* 1 except Af k . If b. k 

is formed by the alternate procedure (as the j -th row of Bj' ) , then again it is 

orthogonal to all vectors retained in D k + 1 except A f k (by virtue of B k D k = I mk ). 
In either case then, one has 

(b k ) T Af. k + i = 0, i = 1, 2 (m k + 1 - 1) (2.21) 

since Algorithm IB places Af k in the m k + 1 column of Df + 1 . 

The vectors Af , k+1 of equation (2.21) were formed at some previous instants k. , 
that is 

A f k + 1 = Af k i, i = 1, 2 K + i * 1) (2-22) 
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where 


1 < kl <k 2 < ... <k <s+i . 1} <k. 

Then equation (2.21) can be expressed as 

b k A f k i =0, i = 1, 2 K + i * !)• (2.23) 

From equation (2.4) of Algorithm IB, one obtains 

H k + 1 A f k = Ax k (2.24) 

for every value of k from 1 to p at which an acceptable vector b k is formed. 

(The step k = p in Algorithm IB is defined by m k + 1 = n.) Also from (2.4), 

H k+1 Af k i = H k A f k i , i = 1, 2 <iTik +1 - i) (2.25) 

using (2.23). Equation (2.25) implies that 

H k+i A fk! = H k Af k i = H k_1 Af k i = ... = H k i + 1 Af k i (2.26) 

or 

H k + 1 Af k i = Ax k ‘ , i = 1, 2, .... m Ve + 1 - 1 (2.27) 

using (2.24). 

Using the identity of (2.22), equation (2.27) becomes 

H k + 1 Af k + i = Ax. k + 1 , i = 1, 2 tn k + j - 1 . (2.28) 

Since it has been assumed that an acceptable vector b k has been formed at 

step k, Algorithm IB has placed Af k in the m k + 1 column of B k + 1 , i.e., Af k = f k ^ . 

Hence, from (2.24), 
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(2.29) 


H k+ 1 Af k+ 1 

m k+ 1 


Ax k + 1 , 
1 


Combining (2.28) and (2.29) yields 


H k+i Af k + i = Ax k+1 , i = 1, 2, .... m k + j (2.30) 

- for every step k at which an acceptable vector b k has been formed. 

At any step k at which an acceptable vector b k cannot be determined. Algorithm 
IB utilizes equation (2.19) which implies that (2.30) is trivially true at this step 
as well. Hence, equation (2.30) is true at every step k, which verifies that equa- 
tion (2.3) holds for Algorithm IB. The proof for Algorithm LA follows directly, 
since Algorithm LA can be derived from the alternate procedure of Algorithm IB 
by letting m k = n. 

Note that Algorithm IB has the same weakness as IA in that there is no assurance 
that the vectors Ax 1 , corresponding to the retained independent vectors Af 1 , axe 
themselves independent. Thus even though H k is always defined, it may become 
singular. Once this happens, say at k = q. Algorithm I will not be able to con- 
verge to a solution x of f (x) = o unless the vector x 1 * - x happens to be in the 
subspace spanned by the columns of H^. This can be shown by the following, 
which is similar to the development in [5] . 


In both Algorithm LA and IB, H k+1 

H k + i = H k + 


= H k 


can be written as 

(Ax k - H k Af k ) (b k ) T 
(b k ) T Af k 

( a k fk + Af k ) (b k ) 
(b k ) T Af k 



(using (2.1) ) 
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or 


where 


Then 


H k + i = H k L k 



(a k f k + Af k ) (b k ) T 
(b l 4 ) T Af k 


H k+1 = H 4 ! L q L q+ 1 ... L k 

where Ho is assumed to be singular. Hence from (2.1), 

Ax k + 1 = - a k + 1 H k+1 f k + 1 


or 


where 


Ax k + l = Hiv 

v = - a k+1 LoLo +1 ... L k f k+1 . 


(2.31) 


(2.32) 


(2.33) 


Thus Ax k+1 , which is some linear combination of the columns of Ho, is confined 
to the subspace spanned by these columns , and this is true for every k > q. It 
is also true obviously for Ax q so that the iteration cannot possibly converge to 
a solution x unless (x q - x) happens to lie in this same subspace. 

If criterion (2.5) is ignored, the resultant algorithm (denoted as IB-S) is similar 
to Zeleznik’s construction [26] . The independence (or lack of independence) of 
the vectors A f 1 is ignored in this construction. Thus H k may not be defined 
(in addition to having the risk of becoming singular as in Algorithm IB), causing 
obvious numerical problems. The combination of Algorithm IB-S followed by 
LA-S forms a logical Secant Method algorithm, denoted as I-S, in which the inde- 
pendence of the vectors Af* is consistently ignored. 
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Algorithm n, to be discussed next, assures that the retained vectors Ax* are 
independent, as well as the corresponding vectors A f». Thus the matrix H k 
remains well-defined and non-singular. 


2.3 ALGORITHM H 


Algorithm II consists of the combination of Algorithm CEB (when < n) followed 
by Algorithm HA (when m k = n). The matrix H k in Algorithm II is updated as 
per 

(A x k - H k A f k ) (a. k ) T H k 

H k+ 1 = H k + (2.34) 

(a. k ) T H k A f k 


provided that the vector a k satisfies 


a l(a, l > T A**! 

7 ‘ = llaj-l, l|A* k ll 2 ” 


(2.35) 


and 


/3 k £ 


( aj k ) T H k A f k | 


(H k ) 


T a. k I 


I A f k l 


(2.36) 


where p t , p 2 are preselected constants in the interval (0, 1). Otherwise H k is 
retained unchanged (i.e., H k+1 = H k ). The formation of a k in Algorithm n is 
accomplished as follows. 

In Algorithm HA (used when m k = n), a k is formed from the j -th row of (AX k ) -1 
where AX k is the n x n matrix formed from the vectors Ax? 1 , i.e., 

(a. k ) T = j-th row of (AX 1 *) -1 (2.37) 
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where 


A Xl* = [A xf A x 


• Ax nl 

n J nx n 


(2.38) 


The integer j is selected as the first integer in the sequence j = 1, 2, etc., 
such that criteria (2.35) and (2.36) are both satisfied. The new matrix A#* 1 is 
formed from AX k by replacing A x! 1 in the latter matrix with the current vector 
A x k and permitting the resultant matrix so as to place Ax k in the last column, 
i.e., 


A X k + 1 = [AX'* + (Ax k - A x. k ) (e. ) T ] (P. ) T 


(2.39) 


where Pj and e j are as defined in (2.9). 


The matrix AX’ 1 * 1 is not explicitly deeded in Algorithm HA but rather the 
matrix (A X k + ’)“ 1 . This matrix is readily calculated from 


(AXic* 1 )" 1 




(AX k + 1 ) _1 



(A X k ) -1 Ax k ) (a^) 7 
(aj k ) T A x k 


(2.40) 


where P. and e . are as defined in (2.9). 

Equations (2.34), (2.37), and (2.40), in conjunction with (2.1), constitute Algo- 
rithm HA in the case where a suitable vector a k can be determined, i.e., one 
satisfying criteria (2.35) and (2.36). In this case, the vector pair (Ax k Af k ) be- 
comes one of the pairs satisfying (2.3) at k + 1 . If a suitable vector a . k cannot 
be determined, then H k and (AX k )“ 1 are retained unchanged (i.e., H k+1 s H k 
and (AX k+1 ) -1 = (AX k ) -1 ) and the iteration is continued with (2.1). In the latter 
case, the vector pair (Ax k , Af k ) does not satisfy (2.3) at k + 1. 
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A simpler algorithm naturally results if the criteria for independence, (2.35) 
and (2.36), are ignored. This algorithm, denoted as IIA-S, is obtained by letting 
j = 1 in equation (2.37). It is another Secant Method algorithm similar to 
Algorithm IA-S. Again, the existence and non-singularity of H k are not assured 
in this algorithm. 

As was the case with Algorithm IA, Algorithm IIA requires an initial set of n 
vector pairs (Ax k °, Af ko ), i = 1 to n. Again, the vectors Af k ° must be inde- 
pendent in order to define H k ° from (2.3)^ In addition, the vectors A x. k ° must be 
independent in order that (AX k °)" 1 exist. The use of Algorithm IIA then insures 
that at every subsequent step k > k 0 , the vectors Af k are independent as well 
as the vectors Ax k . This insures that H k is both defined and non-singular. 

When m k < n , Algorithm IIB is utilized to form a k . This algorithm permits the 
iteration to start with only initial estimates x 1 of the solution and H 1 of the 
inverse Jacobian. After n or more iterations, Algorlthm^IB will have formed n 
vector pairs (Ax k °, Af k °) in which the independence of the vectors Ax k ° is 
assured as well as that of the vectors Af k °. Thus Algorithm IIA can take over 
at k = k 0 . Algorithm IIB can be described as follows. 

At any step k , where m k < n , let matrices A k and C k be represented as 


, A 

A k = 


( a i ) 


kNt 


( ak ) T 
V ™k ' 


c k - 


= [ AX 1 A <| n> 


(2.41) 


m k 


x n 
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where 


The vectors Ax >, i = 1 to m k , represent m k < n independent vectors Ax” ) 
m < k , that were previously formed (by this algorithm) and retained in C k . 

Form the current vector Ax k in the usual way, that is from equation (2.1). Then 
form the vector a k by some orthogonalization method (e.g., Gram-Schmidt 
orthogonalization) so as to be orthogonal to the m k vectors Ax^ in C k . 

If the vector a k thus formed satisfies criteria (2.35) and (2.36), then form H k + 1 

1 CL, 

as per (2.34) and form A k + 1 , D k+1 as 


A k + 1 


A k 

M i, o 


( a j k ) T 

(a. k ) T Ax k 


— I + 1 ) * n 


(2.42) 


where 


A k 

n l ,0 


A 


A k - 


A k Ax k (a. k ) T 
(a k ) T Ax k 


and 


r k+ i 
^ i 




( m k + 


1) 


(2.43) 


It is easy to show that the matrices A k+1 and C k + 1 thus formed satisfy 

Af'Cr 1 = I..„ (2-44) 
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where 


m k+l m k + * • 


If the vector af as formed by orthogonalization fails to satisfy criteria (2.35) 
and (2.36), then select af 1 as 

(af) T = (e^ ) T Af = j-th row of Af (2.45) 


where j is the smallest integer (1 < j < m k ) such that criteria (2.35) and (2.36) 
are both satisfied. Then, using this j and associated vector af, form H k+1 as 
per (2.34) and form Af + 1 , Cf + 1 as j t ’ ^ 


Af + 1 


= P. 


Af + 


(e^ - Af Ax k ) (af ) T 
(a k ) T Ax k 


(2.46) 


and 


Cf + i = [Cf + (Ax k - Axf) ( ej ) T ] (P. ) T (2.47) 


where Pj and e i are as defined in (2.17). It is easy to show that the matrices 
Af + 1 and Cf + l thus formed satisfy 


Af + 1 Cf + i 


I 

m k + l 


where 


m k + 1 = m k • 


(2.48) 


If an acceptable af (satisfying criteria (2.35) and (2.36) ) cannot be found by this 
alternate procedure, then let 
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and continue the iteration with equation (2.1). 

If and when a point k = p is reached such that m k + j = n , then one has 

Af +1 = (Cp +1 )-i = (AXp + 1 )“ 1 (2.50) 

where AXp +1 contains n independent vectors Ax* . At k = p + I s k 0 , the 
iteration can be continued with Algorithm HA (which is essentially the same as 

the alternate procedure just described for forming a. k , with = n). 

d 

To start Algorithm IIB (at k = 1), one treats A/ and as empty, i.e., nij = 0. 

Then try a. 1 = Ax 1 (as would be the case using Gram-Schmidt orthogonalization), 

5 4 

and this obviously satisfies criterion (2.3#). Usually, criterion (2.35) would also 
be satisfied by this choice. If not, H 2 would be set equal to H 1 (as per (2.49)) 
and the iteration continued (A 2 and C 2 would remain empty in this case). 

Note that satisfaction of criterion (2.35) in Algorithm IIB implies that Ax k does 
not lie almost entirely in the subspace of the previous (m k + 1 - 1) vectors Ax* 
that are to be retained in C ^ + 1 . This follows from the fact that a k , whether 
formed by orthogonalization or by the alternate procedure of equation (2.45), is 
orthogonal to these (m k + 1 - 1) vectors. Hence if a j* is not orthogonal to A x k , 
then Ax k cannot be a linear combination of these (n^ + j - 1) vectors. Similarly 
satisfaction of criterion (2.36) insures that Af k does not lie almost entirely in 
the subspace formed by the previous (m k+1 - 1) vectors Af* corresponding to 



the above Ax' . This follows from the fact that the vector (H k ) T aH is orthogonal 
to these (m k + 1 - 1) vectors Af' as shown by the following. 

(a. k ) T H k A f ‘ = (a k ) T Ax i (using (2.3)) (2.51) 

= 0 . 

A simpler algorithm than IIB naturally results if the criteria for independence 
(2.35) and (2.36) are ignored. This algorithm, denoted as Algorithm IIB-S, con- 
sists of equations (2.1), (2.34), and (2.42). Like (Secant) Algorithm IIA-S, it 
would encounter numerical problems whenever the vector A x k (A f k ) failed to be 
sufficiently independent of the previous vectors Ax' (Af 1 ), o <k - i < n. The 
combination of Algorithm IIB-S followed by IIA-A is denoted as Algorithm II-S. 
It is essentially the same as the Rosen-Barnes (Secant) Method [1] , [18] , al- 
though for k > n it involves fewer computations per iteration than the latter 
method. 

The proof that H k , as formed by Algorithm II, satisfies equation (2.3) is directly 
analogous to that given for Algorithm I and so will not be repeated. 
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SECTION III 


LOCAL CONVERGENCE OF ALGORITHMS I AND H 

First the conditions insuring local convergence of the regular Secant Method 
will be reviewed. Assume that 


a. f (x) has continuous second derivatives in some neighborhood of a 
solution x to f (x) = o. 


b. The Jacobian matrix J (x) 


o f ; 

B x. 

, J , 


is non-singular at the solution. 


Let 


d k = det 


8 x k “ n 
5 x k-n II 2 ' 


b X k_ 1 

11 S x k “ 1 1| 2 


(3.1) 


where 



i = k “ 1, . . . , k - n . 


Then Bittner [2] proved that local convergence of the Secant Method is assured 
when the following additional condition is met. 

Given u> such that 0 < w < 1, 

|d k l > co, for every k > n. (3.2) 

where d k is defined by (3.1). 

Tornheim [21] proved that under these same conditions, local convergence is 
superlinear of order at least that of the positive root of 

^n+l r \.n + l, (3 .3) 
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Some representative values of this root are 1.62 for n = 1, 1.18 for n = 10, 
and 1.03 for n = 100. 

Ortega and Rheinboldt [ 16] have generalized these results and relaxed require- 
ment (a) to f (x) having Lipschitz continuous first derivatives in some neighbor- 
hood of the solution. The matrix SX k (as defined by (1.22) ) is said to be a 
member of a class of uniformly non-singular matrices if condition (3.2) is 
satisfied. Assuming this to be the case, it is shown that J k (as given by (1.24) 
or (1.19) ) is a strongly consistent approximation to J (x k ) in that there exists 
constants c and r > o such that 

II J k - J(x k )|| < c||SX k || (3.4) 

when 

||SX k || < r . 

Under some additional assumptions that the iterates x k given by 

x k + 1 = x k - (J k ) _1 f k (3.5) 

are well defined and that II S X k || remains suitably small, it is shown that the 
iteration of (3.5) converges super linearly to a solution x of f (x) = 0, the order 
of convergence being at least equal to the positive root of (3.3). As a corollary, 
J k - J (x) as k - a) . 

The determination of whether (3.2) is satisfied at every step k > n is generally 
impractical since it involves a considerable amount of computations. In Algo- 
rithms I and II, comparatively simple criteria are utilized to determine the 
relative independence of the vectors comprising AX k and AF k . If these criteria 
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are initially satisfied at every step, these algorithms follow the same steps as 
the Secant Method. Under these conditions and with some additional assumptions, 
it will be shown that Algorithms I and II satisfy condition (3.2) and hence possess 
super linear convergence. When these criteria are not initially satisfied at every 
step, then as previously discussed these algorithms depart from the regular 
Secant Method. In so doing, the algorithms at least permit linear convergence 
as will be discussed later. 


First it will be shown that the condition of (3.2), under one additional assumption, 
is satisfied by the condition 


where 


is given, 


I A k | > Wj, for all k > n 

e (0, 1 ) 


A k = det 


Ax k ~ n 

l|Ax k " n H 2 


Ax k_1 \ 
|| Ax k_ 1 Il 2 / 


(3.6) 


and 


Ax 1 


£ 


x i + i 


X* . 


Note that 

det(bx k-n ••• Sx k_1 ) = det (Ax k ~ n ... Ax k_1 ) (3*7) 

since simple operations that leave the value of a determinant unchanged will 
convert one form into the other. 

Thus 
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(from (3.1) ) 


d k 


det (5 x k-n . .. 

g x k“"l) 

||Sx k_n |l 2 

|| 8 x k ~ 1 ll 2 

det (Ax k_n . . 

. . Ax k ~*) 

|| S x k “ n || 2 

II 8 x k-1 ll 2 

||A x k-n ll 2 .... 

II Ax te — 1 II 2 

|| S x k-n || 2 

|| 8 x k_I ll 2 


(using (3.7)) 


(3.8) 


using the definition of A k in (3.6). 

The left hand determinant in (3.7) represents the volume of the n -dimensional 
parallelopiped with a vertex at the origin and n sides S x‘ , while the right hand 
determinant represents the equal volume parallelopiped having a vertex at the 

origin and n sides Ax 1 , i = k - n, k - n + 1 , k - 1 . Assume that for 

every k > n, the ratio of the smallest side in the second representation to the 
largest side in the first representation is bounded below by a constant c lf i.e., 


llAx 1 II 2 

ii c .if ^ !»j» h n, k “■ n 1, • • • > k~l 

II 5 xJ II 2 

k > n (3.9) 

where 

c, e (0, 1] 

The bound Cj cannot exceed unity since Ax k_1 = Sx k_1 . Then if (3.6) is satisfied, 
(3.8) and (3.9) imply that 

|d k I > (c,) nM = a, w£(0,l) (3.10) 


showing that (3.2) is satisfied. 
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It will now be shown that satisfaction of criteria (2.35) and (2.36) in Algorithm IIB 
via the orthogonalization procedure at every step k < n, and satisfaction of these 
criteria in Algorithm IIA with j = 1 at every step k > n, insure that condition 
(3.6) is satisfied. This then would insure the super linear convergence of Algo- 
rithm n as well as the convergence of H k to J (x) where x is the solution to 
f (x) = 0. Note that satisfaction of the aforementioned criteria in the manner 
described means that Algorithm n would be following the same steps as Algo- 
rithm n-S which is equivalent to the Secant Method. 

At any instant k, k < n, Algorithm IIB-S has formed the vectors Ax 1 , Ax 2 , . . ., 
Ax k_1 , Ax k , as well as the vectors a. 1 , i = 1 to k, via an orthogonalization pro- 
cedure. The Gramian of the k vectors Ax*, i = 1 to k, is given by 


GCAx 1 , .... Ax k ) = 


(Ax 1 , Ax 1 ) .... (Ax 1 , Ax k ) 


(Ax k , Ax 1 ) .... (Ax k , Ax k ) 


(3.11) 


As shown by Gantmacher [ 10] , this Gramian can be expressed as 

□ (Ax 1 , .... Ax k ) = G (Ax 1 , .... Ax k_1 ) (h k ) 2 (3.12) 

where h k is the Euclidean length of the component of Ax k that is orthogonal to 
the subspace spanned by the vectors Ax 1 , . . ., Ax k_1 . 

Now the formation of a k via Algorithm IIB-S has insured that a k is orthogonal 
to this same subspace. Further, it is assumed that criterion (2.35) is satisfied 
at every k, k < n, which implies that 
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> 


P\ l|Ax k ll 2 . 


(3.13) 


ICaf. Axk )l 



Suppose one constructs an orthonormal basis for the orthogonal complement to 
this subspace in which aH/ II af H 2 forms one basis vector. Then an interpreta- 
tion of (3.13) is that the vector Ax' 5 has a component greater than p x ||Ax k II 2 
along this unit vector, the latter being orthogonal to the aforementioned subspace; 
hence the component of Ax k orthogonal to this subspace must be greater than 
pjAx k II 2 , i.e., ' 


h k > p 1 ||Ax k ll 2 . 


(3.14) 


Then from (3.12), 

GCAx 1 , .... Ax k ) > (P x ) 2 (||Ax k |l 2 )2G(Ax 1 , .... Ax^ 1 ). (3.15) 

Since (3.15) is true for every k, k < n, one has 
GCAx 1 , Ax 2 ) > (Pj) 2 (|Ax 2 || 2 ) 2 GCAx 1 ) 

= (p x ) 2 (l|Ax»|| 2 ) 2 (||Ax 2 H 2 ) 2 

G(Ax 1 , Ax 2 , Ax 3 ) > (Pj) 2 ( II A x 3 II 2 ) 2 G(Ax 1 , Ax 2 ) 

> (p t) 4 ( || Ax 1 II 2 ) 2 (l|Ax 2 || 2 ) 2 ( ||Ax 3 || 2 ) 2 


and finally 


GCAx 1 , .... Ax") > (p i ) 2 (n-l> (llAx 1 !^) 2 ... ( ||Ax"|| 2 ) 2 • (3.16) 


Since 
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i 



AX n+ 1 


(3.17) 


= [Ax 1 Ax 2 ... Ax"] 


equation (3.11) implies that 

GCAx 1 , Ax n ) = det [(AX n+1 ) T AX n+1 ] 


= '[det (AX n+1 )] 2 . 


(3.18) 


Hence from (3.16) and (3.18), 

| det (AX n + 1 ) | > (pj)"' 1 llAxMlj ... || Ax n || 2 (3.19) 


det 




> (Pi )" -1 


or 

| A n + 1 1 > (Pi) n_1 p co 1 . (3.20) 

Thus criterion (3.7) is satisfied for k = n + 1. This result will now be extended • 
to all k, k > n. 

Similar to (3.12), one can express G(Ax k_n+1 , . . ., Ax k ) as 

G(Ax k_n + 1 , .... Ax k ) = G(Ax k_n+1 , .... Ax k_1 ) (h k ) 2 (3.21) 

where h k is the length of the component of Ax k that is orthogonal to the subspace 
formed by the n - 1 vectors Ax k_n+1 , . . ., Ax k_1 . Algorithm IIA-S forms a. k 
so as to be orthogonal to this subspace; hence satisfaction of criterion (2.35) 
produces the same inequality for h k as before, namely 
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ill ■■nim ii ■ i in i i i H i * 11 


h k > p x || A : 


(3.22) 


Hence, 

G(Ax k " n + 1 Ax k ) > (p^ 2 (|| Ax k || 2 ) 2 G(Ax k_n + 1 Ax k_1 ). (3.23) 

Now a k “ 1 > whether formed by Algorithm IIA-S or IIB-S, is orthogonal to the 
subspace formed by Ax k_n+1 to Ax k “ 2 (it may be orthogonal to a larger subspace), 
so that one obtains 

G(Ax k-n + 1 A x k " 1 ) > (p x ) 2 ( || Ax k_ 1 1| 2 ) 2 G(Ax k-n+ 1 Ax k_2 ) . (3.24) 

Continuing in this fashion, 

G(Ax k_n+I , .... Ax k-2 ) > (Pj ) 2 (l|Ax k_2 H 2 ) G(Ax k “ n+1 , . . Ax k-3 ) 


(3.25) 


G(Ax k-n + 1 , A x k-n + 2 ) > (p x ) 2 ( || Ax k_n + 2 || 2 ) G(Ax k-n + 1 ) 


( Pl ) 2 ( ||Ax k-n + 2 || 2 ) ( || A x k_n + ^ II 2 ) 


Combining the above inequalities yields 


G(Ax k_n + 1 , . . . , Ax k ) > (p 1 ) 2 ( n “ 1 ) (||Ax k " n+1 || 2 ) ... ( || Ax k || 2 ) . (3.26) 


Since 


G(Ax k ~ n+1 , .... A x k ) = [det (AX k + 1 )] 


(3.27) 


one obtains 
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det 


Ax k ’ n + 1 
I Ax k_n+ 1 II 


Ax k 
I Ax k II 


(P.i) n 


- 1 


k > n 


(3.28) 


det 


Ax k-n 
II Ax k_n || 2 


Ax k ~ 1 \ 

|| Ax k “ 1 1| 2 ) 


> (pj)"" 1 , k > n + 1 


or 

I A k | > (p^" -1 = ojj, k > n + 1. (3.29) 

Combining (3.29) with (3.20) shows that condition (3.7) is satisfied for all k , 
k > n , when the criteria of Algorithms IIB and IIA are initially satisfied as de- 
scribed. Then under these conditions and the previous assumptions, Algorithm II 
possesses local superlinear convergence of order at least that of the positive 
root of (3.3). 

The extension of these results to Algorithm I is fairly simple, although some 
additional assumptions are required as will be shown. First assume that cri- 
terion (2.5) in Algorithm IB is satisfied at each step k < n via the orthogonaliza- 
tion procedure and that this criterion in Algorithm LA is satisfied at each step 
k > n by j = 1 . Then in an analogous manner to the foregoing one obtains 


det 


A f k_n 

||Af k-n jj^ 


A f k_ 1 \ 

l|Af k -MlJ 


> (p,)"' 1 , k > n 


(3.30) 


where p t is a preselected number in the interval (0, 1). The matrices AX k and 
AF k satisfy 


3-9 



H k AF k 


(3.31) 


AX k = 


so that 


det (AX k ) = det (H k ) det (AF k ) . 

Then from (3.32), 


(3.32) 


det 


( 


Ax k_n 
|| Ax k-n || 2 


Ax k-1 \ 

|| Ax k " 1 1| 2 J 


|det(H k )| 


|| A f k_n || 2 . . . ||Af k -*|| 2 

|| Ax k-n || 2 . . . || Ax k “ 1 II 2 


A f k-n 
|| Af k_r> || 2 


A f k ~ 1 \ 

|| A f k “ 1 ll 2 y 


> l det ( Hk )l 

(l|H k || 2 ) n (Pl) ' 


(using (3.30) and (3.31)). (3.33) 


Assume that H k is non-singular. This is not assured in Algorithm I even though 
the satisfaction of criterian (2.5) assures the non-singularity of AF k . However, 
as previously discussed, if the vectors A f 1 comprising AF k are formed from 
iterates close enough to a solution x, then the independence of the vectors A f * 
assures the independence of the corresponding vectors Ax* (assuming J (x) is 
non-singular) and hence assures the non-singularity of H k . Assume further that 


ldet (H**)| > 

( || H k ll 2 ) n “ C2, 


k > n 


where c 2 e (0, 1] . 

Note that c 2 cannot exceed unity since for any matrix A, 


(3.34) 
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(3.35) 


and 


|det A| 



II A|| > p (A) for any matrix norm 
where p (A) = spectral radius of A. 

Then (3.35) can be written as 


det 


Ax k ' n 
| Ax k_n ||” 


Ax k ~ 1 \ 
| A x k_ 1 II” j 


> C 2 ( p ^"" 1 



(3.36) 


or 

I A k | > co v e (0, 1), k > n 

so that condition (3.7) is satisfied. Thus under the foregoing assumptions, Algo- 
rithm I has the same superlinear convergence if the associated test criteria are 
initially satisfied at every step. 

If the test criteria in Algorithm I or n are not initially satisfied in the manner 
described, then superlinear convergence is not assured. However, linear con- 
vergence is still possible if the matrix H k satisfies 

III - H k J(x)|| < 1 v (3.37) 

where J (x) is the Jacobian matrix at the solution x. This is shown by the fol- 
lowing. With a k = 1 , the next iterate using (2.5) is 


x k + i - x - x k -x-H k [J (x) (x k - x) + v e ] (3.38) 
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where v f accounts for the higher order terms in the Taylor expansion of 
f k = f (x k ) about the solution x . Then 

x k + 1 - x = [i - H k J(x)] (x k - x) - H k v 6 

and 

|]x k + 1 - x|| < III - H k J (x)H IU k - xll + II H k || ||vj| (3.39) 

where 

II vj = 0(||x k - x|| 2 ). 


Thus local linear convergence is assured if (3.37) is satisfied. 

A prerequisite to satisfying (3.37) is that H k be non-singular (it has already 
been assumed that J (x) is non-singular). For if H k is singular, then so is 
H k J (x), and I - H k J (x) would have a characteristic root equal to unity since 


det ^A.I-(l-H k J (x) ) j = 0 

when 


(3.40) 


X = 1 . 


Then the spectral radius of I - H k J (x) would be at least equal to unity implying 
that 

III - H k J(x)|| > 1 (3.41) 

for any matrix norm. Algorithm II inherently insures that H k is non-singular. 
However Algorithm I does not, although as previously discussed, H k will be non- 
singular in Algorithm I as well if the vectors Af 1 comprising AF k are formed 
from iterates close enough to a solution x. 
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Even though H k is non-singular in Algorithm n, and assumed to be so in Algo- 
rithm I, there still is no assurance that (3.37) will be satisfied. Heuristically 
though, one would expect this to be the case if for example the initial estimate H 1 
is a good approximation to J (x) and if the problem is not severely nonlinear. In 
any event, the conditions under which Algorithms I and n depart from the Secant 
Method are those tending to cause poor convergence of the latter. The alterna- 
tives followed by Algorithms I and II in these circumstances at least retain the 
potential for linear convergence. 
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SECTION IV 


NUMERICAL EXPERIMENTS 

In Section II , variants of the Secant Method are presented consisting of Algo- 
rithms I and n. It is claimed that these algorithms are an improvement, in 
terms of convergence, over corresponding existing versions of the Secant 
Method. In order to test this hypothesis, a series of randomly generated test 
problems have been run, each via several different methods. The problems 
were rim on an APL/360 computing system having a precision of approximately 
16 decimal digits. Programming and other details are given in Appendix 2. 

The type of problem investigated is the same as that considered by Fletcher and 
Powell [ 7] and by Rosen [ 18] , namely 

n 

f i( x ) = E i ~ ( a ij sin Xj +b ij cosxj) = 0 (4.1) 

j*i 

i = 1, 2, . . . n . 

This can be expressed in vector form as 

f (x) = E - (A sin x + B cos x) = 0 (4.2) 

where the notation sin x (or cos x) refers to the n -vector whose i -th component 
is sin Xj (or cos Xj ), where x £ is the i -th component of x. The components of 
the n x n matrices A and B are random integers in the range -100 to +100. The 
n -vector E is obtained from 

E = A sin x + B cos x (4.3) 

where x is an n -vector each of whose components is a random number between 
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-77 and +77 . Thus x is automatically one solution to (4.2). There are generally 
multiple solutions to (4.2) as noted in [ 18] . For each problem, the starting 
estimate x 1 is generated as 

x 1 = x + 0. 1 S (4.4) 

where S is also an n -vector whose components are each random numbers be- 
tween -77 and +77. In a sense then, the starting estimate x 1 is close to a solu- 
tion x although there may be other solutions in the vicinity. 

Five independent problems were generated for each value of n (see Appendix 2 
for details). The values of n considered were 2,5, 10, and 15 for a total of 
20 problems. Each problem was run with eight different algorithms. Besides 
Algorithms I and II, these included a discrete Newton's Method, Secant Algo- 
rithms I-S and II-S, Broyden's Methods 1 and 2, and a combination of Broyden's 
Method 1 and Algorithm II-S. Newton's Method was selected to serve as a 
standard against which the efficacy of the other algorithms could be judged. 

Secant Algorithm II-S is equivalent to the Rosen-Barnes version of the Secant 
Method, although as noted previously in this report, it involves fewer computa- 
tions per iteration when k > n. Secant Algorithm I-S is equivalent to a form 
suggested by Zeleznik [26] , or to the quasi-Newton form of the Secant Method 
described in [ 16] . Thus Algorithms I-S and II-S provided a basis for judging 
the claim that Algorithms I and II are improved variants over corresponding 
versions of the Secant Method. Broyden's Methods 1 and 2 were included because 
of their simplicity and because of the initial similarity of Method 1 to Algorithm]! 
(or n-S) and of Method 2 to Algorithm I (or I-S) . Finally a combination of Broy- 
den's Method 1 and Algorithm II-S was included because of Rosen's determination 
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that such a combination generally yielded better results in his series of 
problems than either method alone. 

When Algorithm I or n was utilized, it was necessary to specify a value of p x . 
Each time these Algorithms were employed, two or more runs were made with 
different values of p x , in order to obtain some measure of the effect of p x on 
convergence. Concurrently, an attempt was made to develop a rule of thumb for 
selecting a nominal value of p x that would yield good results. Section in pro- 
vided some insight on the effect that p x has in establishing a lower bound for 
the ill-conditioning of Ax k . This together with initial experimentation resulted 
in the selection of some "nominal" values of p x as n x 10~ 3 and 2n x 10" 3 . The 
results using these values of p x are listed in subsequent tables. 

When Algorithm n was used, it was also necessary to specify a value of p 2 . In 
order to avoid a further increase in the number of runs, which was already con- 
siderable, only one value of p 2 was specified for each value of p x . The specifica- 
tion was arbitrarily set at P 2 = 0.1 p x . The parameter p 2 is used in criterion 
(2.36) to insure the independence of the vectors A f 1 that are retained for subse- 
quent use. However when m k < n, criterion (2.36) is a sufficient but not neces- 
sary condition for assuring this independence. In other words, criterion (2.36) 
could fail even though the vectors Af» were sufficiently independent. In order 
to reduce the probability of this unnecessary rejection of a. k as formed by 
orthogonalization, it was deemed desirable to set p 2 lower than p x . At the same 
time a value of p 2 > 0 oA.il considered necessary to avoid ||H k || - °°. 

In the combination of Broyden's Method 1 and Algorithm II-S, the vector z k in 
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H k + 1 


(4.5) 


Hk + (Ax k ~ H k A f k ) (z k ) T H k 
(z k ) T H k Af k 

is formed as an, average of the vectors in the individual methods. When each of 
the above methods is used by itself, only the direction of z k is important since, 
a constant multiplier has no effect. But when an average of two different vectors 
is performed, the relative scaling assigned to each is important since it affects 
the direction of the resultant average. Since this scaling was not explicitly de- 
scribed in Rosen's paper [ 18] , the author decided on the magnitude scaling in- 
herent in each method as it is described elsewhere in this report. Thus in 
Broyden's Method 1, z k is set equal to Ax k , while in Algorithm 1-S z k is the 
component of Ax k that is orthogonal to the subspace formed by the previous 
vectors Ax 1 , 0 < k - i<n. In this scheme, the ratio of the magnitude of z k as 
produced by Algorithm II-S to that produced by Broyden's Method 1 is a maximum 
of unity when A x k is orthogonal to the aforementioned subspace and becomes zero 
when Ax k lies in this subspace. 

All of the algorithms that have been listed could be utilized either in a full step 
version (a k = 1) or in a reduced-step version (a k selected to minimize or reduce 
cp k + 1 ). Only the full step version was used in these experiments for the following 
reasons. Methods that select a k on the basis of minimizing cp k + 1 are somewhat 
inefficient in terms of function evaluations. Secondly, there are a great many norm- 
reducing schemes. To utilize even just a few of these on top of the run permuta- 
tions already considered would result in a morass of data in which the basic pur- 
pose of the investigations would be masked. Certainly the relative efficiency of 
norm-reducing methods is peripheral to this investigation. The basic purpose of 
this experimentation is to test the convergence characteristics of Algorithms I 


4-4 



and II near a solution and to compare them with those of Secant Algorithms I-S 
and II-S among others. This comparison is facilitated by the use of a single gen- 
erally accepted method of selecting the step size and certainly a k = l qualifies. 

Each algorithm considered requires an initial estimate H 1 of the inverse Jacobian 
as well as an initial estimate x 1 of the solution. Algorithms I and II, as well as 
I-S and II-S, could function with any initial estimate H 1 provided a norm -reducing 
method for selecting the step size were included. These algorithms, for a linear 
system, generate the system inverse Jacobian after at most n iterations. For a 
non-linear system, these algorithms would generate an approximation to the in- 
verse Jacobian at x 1 within n iterations, provided the step size were not so 
great as to drive the iterate to a locale with a markedly different Jacobian. A 
norm-reducing constraint would generally prevent such deviations. However, 
norm-reducing schemes were not included in this numerical experimentation for 
reasons previously discussed. Secondly, starting with an arbitrary H 1 (e.g., 

H 1 = I) provides little if any saving in computations since the n iterations re- 
quired to develop a good approximation to the inverse Jacobian at x 1 are com- 
parable to the computational effort in directly calculating an approximate J _ 1 (x). 
For these reasons, the initial estimate H l in this investigation was uniformly 
generated by computing a discretized approximation to J (x 1 ) and obtaining its 
inverse. This selection also has the effect of not penalizing Broyden's Method 
unnecessarily; the latter is at its best when H 1 % j" 1 (x 1 ). 

The use of the full-step version for all of the algorithms, as opposed to a reduced- 
step norm-reducing version, raises the possibility that in some cases the iteration 
will fail to converge to a solution for any of the methods. To accommodate this 
possibility the following strategy was adopted. If all of the quasi-Newton 
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f 


algorithms (i.e., all except the discrete Newton's Method) failed to converge to 
a solution, the starting estimate x 1 given by (4.4) would be changed to 


x 


1 



0.1S 


(4.6) 


where q assumed the values 1,2, etc., until at least one of the quasi -Newton 
methods converged to a solution. 

The effect of (4.6) is to bring the starting estimate x 1 closer and closer to a 
solution x so as to increase the probability of convergence. The reason for 
adopting this strategy is that the main purpose of the investigation is to compare 
the local convergence of Algorithms I and n with that of the other quasi-Newton 
methods, especially that of Secant Algorithms I-S and n-S. The latter methods 
will at times encounter difficulties after reaching points very close to a solution; 
it is primarily this type of problem that Algorithms I and EL were designed to 
overcome. An evaluation of the region of convergence of the various methods is 
beyond the scope of this investigation. In the actual numerical work, equation (4.6) 
had to be utilized only once, in problem 55, when all of the quasi-Newton methods 
failed to converge. The use of q = 1 in (4.6) resulted in all of the methods con- 
verging to a solution in this problem. 


In judging the effectiveness of the various algorithms in this particular series of 
problems , the primary criterion used is the number of vector function evaluations 
required by the algorithm to converge to a solution. This criterion, which has 
been used by many authors, is simple to apply and provides a fair representation 
of total computational effort especially if the function is at all complicated. In 
the discrete Newton's Method, n + 1 function evaluations are required at each 
step except the last, where only one evaluation is required. The other 
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(quasi-Newton) methods require but one function evaluation per step except for 
the first step (k = i) which requires n + l evaluations to establish the initial 
estimate H 1 . Thus denoting the final iterate index as k f (at which <p kf is less 
than the convergence criterion), the total number of function evaluations is 
given by 

No. of f evaluations = (k f - 1) (n + 1) + 1 (4.7) 

= k f (n + 1) - n 

for the discrete Newton's Method, and by 

No. of f evaluations = (k f - 1) + (n + 1) (4.8) 

= k f + n 

for the quasi-Newton methods. The convergence criterion in all cases was 
arbitrarily set at 

llf k ll 2 < 10" 6 

or 

cpk § (f k ) T f k < 1CT 12 , (4.9) 

A secondary criterion used to judge the effectiveness of the various algorithms 
is the accuracy of the final matrix H kf generated by each algorithm, since the 
accuracy of the estimated inverse Jacobian at the solution is of importance in 
some applications. Also the accuracy of H kf is a distinguishing characteristic, 
especially in those cases where convergence is poor. The standard used for 
comparison is the inverse of the estimated Jacobian at the solution as determined 
by the same discrete process used in the discrete Newton’s Method. Denoting the 
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latter matrix as J“ 1 (x), the error criterion used is the maximum absolute 
component value of the error matrix (H kf - J _1 (x) ). 

For purposes of comparison, it is convenient to divide the quasi-Newton algo- 
rithms into two groups. Group 1 consists of those algorithms in which the up- 
dating of H k has the form 


H k + i = H k + ( Axk ~ H k Af k ) (z k ) T H k 

(z k ) T H k A f k 


(4.10) 


This group then includes the Broyden 1 algorithm, Algorithms n and II-S, and 
the combination of Broyden 1 and Algorithm II-S. Similarly, Group 2 algorithms 
are classified by the form 


= + (Ax" - H> Af") 

(z k ) T Af k 


(4.11) 


Group 2 then consists of the Broyden 2 algorithm and Algorithms I and I-S. 

The results of the computer runs in this series of problems are summarized in 
Tables 4.1 through 4.4. When convergence occurred, as determined by || f || 2 < 10" 6 , 
the distance of the final iterate x kf from a true solution was always less than 1CT 6 
(as measured in the infinity norm). Usually, the closeness of x kf to a true solu- 
tion was in the order of 10” 8 . An asterisk next to the number of function evalua- 
tions for convergence indicates convergence to a solution other than the nominal 
solution x originally set up. In all cases, these other solutions were within 2tt 
of x and usually much closer. The term "Diverge" is used to indicate when an 
algorithm was divergent, the criterion being divergence to a distance (again in 
the infinity norm) of 100 or more from x. The term "Osc" is used to denote 
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Table 4.1 

Convergence of n = 2 Problems 


1st entry: No. of f evaluations to reduce ||f|| 2 below 10“ 6 . 

2nd entry: Max. absolute component error in final estimate of inverse Jacobian 
at solution. 


Algorithm 

Prob. #21 

Prob. #22 

Prob. #23 

Prob. #24 

Prob. #25 

Discrete 

13 

10 

46* 

10 

25 

Newton 

< 10" 6 

< 10" 6 

< 10" 6 

< 10" 6 

3.6 x 10" 6 

Broyden 1 

9 

7 

13 

8 

13 


.00103 

.00240 

.0130 

.00039 

.0575 

Alg II-S 

9 

8 

12 

8_ 

13 


6.9 x 10" 6 

2.5 x 10" 5 

.00758 

2.2 x 10' 6 

12.6 

Br 1 - Alg II-S 

9 

8 

13 

8 

13 


7.3 x 10~ 6 

4.4 x 10" s 

.00625 

1.4 x 10" 6 

12.5 

Alg II 

9 

8 

12 

8 

13 

(P x = n x 10" 3 ) 

6.9 x 10" 6 

2.5 x 10' s 

.0363 

2.2 x 10" 6 

.815 

Alg II 

9 

8 

12 

8^ 

13 

(p t = 2n x 10" 3 ) 

6.9 x 10“ 6 

.0345 

.0363 

2.2 x 10" 6 

.815 

Broyden 2 

11 

7 

Diverge 

8 

14 


.0168 

.00067 


.00020 

.0735 

Alg I-S 

10 

8 

Diverge 

8 

13 


7.8 x 10" 6 

2.6 x 10" 5 


2.4 x 10" 6 

287 

Alg I 

10 

8 

Diverge 

8 

13 

(Pj = n x 10' 3 ) 

7.8 x 1U 6 

2.6 x 10“ 5 


2.4 x 10" 6 

7.48 

Alg I 

10 

8 

Diverge 

8 

13 

(Pj = 2n x 10" 3 ) 

7.8 x 10" 6 

2.6 x 10" 5 


2.4 x 10" 6 

7.48 


* Converged to solution other than nominal one. 
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Table 4.2 

Convergence of n = 5 Problems 


1st entry: No. of f evaluations to reduce ||f|| 2 below 10~ 6 . 

2nd entry: Max. absolute component error in final estimate of inverse Jacobian 
at solution. • 


Algorithm 

Prob. #51 

Prob. #52 

Prob. #53 

Prob. #54 

Prob. #55 

Discrete 

61* 

25 

61 

31 

25* 

Newton 

< 10 -6 

< 10“ 6 

< 10“ 6 

\o 

1 

O 

rH 

V 

< 10“ 6 

Broyden 1 

23 

13 

17 

14* 

17* 


.138 

.00350 

.0165 

.0160 

.0503 

Alg n-s 

26* 

14 

18 

14* 

16* 


65200 

.0437 

.00081 

.0215 

.0316 

Br 1 - Alg H-S 

23 

14 

16 

15* 

17* 


1990 

.0328 

.0198 

.00529 

.200 

AlgH 

22* 

13 

16 

14* 

16* 

(P, = nx 10" 3 ) 

1.40 

.00034 

.00369 

.00935 

.00093 

AlgH 

21* 

13 

16 

14* 

17* 

(Pj = 2n x 10“ 3 ) 

.0882 

.00034 

.00969 

.0208 

1.30 

Broyden 2 

Diverge 

13 

28 

16* 

19* 



.00975 

.0930 

.0576 

.106 

Alg I-S 

Osc 56+ 

13 

37 

15* 

Osc 56+ 


4430 

.00114 

.0252 

.0651 

86.1 

Alg I 

30* 

13 

27 

15* 

43* 

(Pi = n x 10' 3 ) 

1.66 

.00114 

1.82 

.0651 

1.39 

Alg I 

30* 

13 

25 

15* 

40* 

(p x = 2n x 10“ 3 ) 

1.77 

.00114 

.00120 

.0651 

5.15 


*Converged to solution other than nominal one. 
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Table 4.3 

Convergence of n = 10 Problems 


1st entry: No. of f evaluations to reduce l|f|| 2 below 10' 6 . 

2nd entry: Max. absolute component error in final estimate of inverse Jacobian 
at solution. 


Algorithm 

Prob. #101 

Prob. #102 

Prob. #103 

Prob. #104 

Prob. #105 

Discrete 

45 

Diverge 

45 

89* 

56 

Newton 

< 10" 6 


< 10" 6 

1.2 x 10' 6 

< 10" 6 

Broyden 1 

21 

36 

22 

27* 

24 


.00963 

.169 

.00235 

1.37 

.00682 

Alg II-S 

21 

94 

21 

31* 

23 


.00489 

30.9 

.00348 

259 

.0192 

Br 1 - Alg II-S 

21 

62 

22 

29* 

24 


.00928 

10.2 

.00867 

294 

2.09 

Alg n 

20 

37 

21 

26* 

22 

(Pi = n x 10" 3 ) 

.00163 

2.07 

.00348 

.147 

.0215 

Alg II 

20 

40 

21 

26* 

23 

(p j = 2n x 10" 3 ) 

.00163 

.284 

.00360 

.152 

.0371 

Broyden 2 

23 

Diverge 

23 

31* 

26 


.0105 


.00470 

.231 

.0436 

Alg I-S 

21 

Diverge 

21 

29* 

28 


.0791 


.00355 

83.2 

.410 

Alg I 

21 

Diverge 

21 

28* 

27 

(P[ = nx 10' 3 ) 

• uT&j. 


.00355 

3.77 

.437 

Alg I 

21 

Diverge 

21 

28* 

27 

(p t = 2n x 10~ 3 ) 

.00547 


.00213 

2.51 

.414 


Converged to solution other than nominal one. 
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Table 4.4 

Convergence of n = 15 Problems 


1st entry: No. of f evaluations to reduce ||f|| 2 below 10" 6 . 

2nd entry: Max. absolute component error in final estimate of inverse Jacobian 
at solution. 


Algorithm 

Prob. #151 

Prob. #152 

Prob. #153 

Prob. #154 

Prob. #155 

Discrete 

81 

129* 

Diverge 

161 

97 

Newton 

< 10“ 6 

< 10" 6 


< 10“ 6 

< 10“ 6 

Broyden 1 

28 

Diverge 

35 

36* 

31 


.00449 


.0153 

.366 

.0127 

Alg n-s 

28 

Osc 91+ 

33 

34 

30 


.0193 

1.55 

.157 

2.64 

.0719 

Br 1 - Alg II-S 

28 

Diverge 

36 

35 

30 


.00323 


.116 

406 

.00996 

Alg I| 

28 

51* 

33 

32 

30 

(Pi = n x 10" 3 ) 

.0193 

.538 

.0313 

.133 

.0289 

AlgH 

28 

41* 

33 

32 

30 

(fi x = 2n x 10" 3 ) 

.00584 

1.19 

.0262 

.118 

.00922 

Broyden 2 

32 

Diverge 

Diverge 

35 

32 


.0300 



.846 

.0330 

Alg I-S 

30 

Osc 91+ 

61* 

34 

31 


.152 

2.70 

11.2 

12.5 

.155 

Alg I 

30 

Osc 91+ 

63 

34 

31 

(Pi = nxlO' 3 ) 

.0272 

.595 

.411 

2.08 

.152 

Alg I 

31 

56* 

106* 

34 

31 

(Pj = 2n x 10" 3 ) 

.0285 

1.74 

.753 

1.58 

.211 


* Converged to solution other than nominal one. 
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convergence to the vicinity of a solution but lack of convergence to a solution 
within the stated (large) number of function evaluations. 

For each problem, the algorithm (s) providing convergence with the least number 
of function evaluations is (are) identified by the underlining of this number. 
Similarly, the quasi-Newton algorithm(s) yielding the smallest component error 
in H kf is (are) identified by the underlining of this error. In many problems 
more than one algorithm provided convergence with the minimum number of 
evaluations. On occasion, more than one quasi-Newton algorithm yielded both 
the minimum number of evaluations and the minimum component error in H k( . 
When this occurred, or when more than one algorithm yielded the same two 
numbers regardless of whether they were minima or not, the reason was that 
these algorithms were following exactly the same path. Thus for example in 
Problem #21, both cases of Algorithms II as well as Algorithm II-S followed 
identical paths since the criteria associated with Algorithm II were initially 
satisfied at every step. In Problem #52, both cases of Algorithm II happened to 
follow the same step-by-step procedure (but not the same as in Algorithm II-S) 
even though the associated p x criteria differed by a factor of two. 

The algorithms in Tables 4.1 through 4.4 are arranged in the order of discrete 
Newton's Method, then Group 1 algorithms and lastly Group 2 algorithms. A 
quick perusal of these tables reveals that the discrete Newton was generally the 
least efficient in terms of function evaluations. This was pretty much expected 
in view of the required n + 1 function evaluations per iteration. The discrete 
Newton converged to a solution in every problem except two (Problems #102 
and #153). That divergence occurred twice is not surprising. The multiplicity 
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of roots in a small region implies that the Jacobian is singular at many points 
in this region. A chance landing close to one of these points could throw the 
next iterate far out with no chance of recovery. This apparently occurred in 
Problems #102 and #153. When convergence occurred, the discrete Newton 
always provided the most accurate estimate of H kf (generally within 10“ 6 of 
J (x) ), as would be expected. 

A further perusal of these tables reveals that as a group, the Group 1 algorithms 
were generally more effective than Group 2, this behavior becoming more pro- 
nounced as n increased. Moreover this is the case in an algorithm by algorithm 
comparison of corresponding algorithms in each group. Thus Broyden 1 was 
generally more effective than Broyden 2 , Algorithm n more effective than Algo- 
rithm I, and Algorithm II-S more effective than Algorithm I-S. In the case of 
Broyden 1 and Algorithm II, their greater effectiveness (over the corresponding 
versions in Group 2) may be due in part to their inherently avoiding H k becoming 
singular, as previously shown in this report. In the case of Algorithm II-S vs. 

I-S, neither algorithm insures against the singularity of H k . However when k < n , 
H k is less likely to become singular in Algorithm II-S than in I-S. This is shown 
by the following. 

Assume that H k is non-singular. Then as shown in Section I, H k + 1 in Algo- 
rithm II-S will become singular (for Ax k ^ 0) if and only if 

( z k ) T Ax k = 0. (4.12) 

This occurs only when Ax k lies in the subspace of the previous vectors Ax‘, 

0 < k - i < n . In the case of Algorithm I-S, a similar development shows 
that H k + 1 becomes singular (for Ax k ^ 0) if and only if 
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(z k ) T J k Ax k 


0 . 


(4.13) 


Recall that in Algorithm I-S, z k is formed to be orthogonal to the previous 
vectors Af 1 , 0 <k - i <n,or 

(z k ) T Afi = 0, 0 < k - i < n . (4.14) 

But Af * satisfies 

Afi = Jk Ax 1 , 0 < k - i < n (4.15) 

which when substituted in (4.14) yields 

(z k ) T J k Ax 1 = 0, 0 < k - i < n . 

Thus the vector (J k ) T z k in Algorithm I-S is always orthogonal to the vectors Ax 1 , 
0<k - i <n. IfAx k lies in the subspace formed by these vectors, then (J k ) T z k 
must be orthogonal to Ax k as well, showing that (4.13) holds. In addition, when 
k < n, it is possible for the vector (J k ) T z k to be orthogonal to Ax k even if the 
latter lies outside the aforementioned subspace since this subspace would be of 
dimension less than n - 1. If one assumes that the vectors Ax k in both algorithms 
are more or less randomly oriented, then there is greater likelihood of H k be- 
coming singular in Algorithm I-S than in II-S. This may be a factor in the better 
performance of Algorithm II-S compared to Algorithm I-S. 

To facilitate the comparison of performance of algorithms within a group. 

Tables 4.5 and 4.6 have been prepared. For each set of problems, these tables 
list the number of times an algorithm reached convergence in the fewest evalua- 
tions, the number of times an algorithm yielded the most accurate matrix H kf , 
and the number of times the algorithm failed to converge. The first two 


4-15 



Table 4.5 

Comparative Performance of Group 1 Algorithms 

t 

1st entry: No. of times algorithm had fewest f evaluations in a series of 
problems. 

2nd entry: No. of times algorithm had most accurate estimate of inverse 
Jacobian at solution. 

3rd entry: No. of times algorithm failed to converge (in parentheses). 


Algorithm 

n = 2 
Problems 

n = 5 
Problems 

n = 10 
Problems 

n = 15 
Problems 

Alg II 

4 

4 

4 

4 

(Pi = n x 10" 3 ) 

2 

2 

2 

1 


(0) 

(0) 

(0) 

(0) 

Alg II 

4 

4 

3 

5 

(p x = 2n x NT 3 ) 

1 

2 

1 

2 


(0) 

(0) 

(0) 

(0) 

Alg II-S 

4 

2 

1 

3 


2 

1 

0 

0 


(0) 

(0) 

(0) 

(1) 

Broyden 1 

4 

2 

1 

1 


1 

0 

3 

1 


(0) 

(0) 

(0) 

(1) 

Br 1 - Alg II-S 

3 

1 

0 

2 


2 

1 

0 

1 


(0) 

(0) 

(0) 

(1) 
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Table 4.6 

Comparative Performance of Group 2 Algorithms 


1st entry: No. of times algorithm had fewest f evaluations in a series of 
problems. 

2nd entry: No. of times algorithm had most accurate estimate of inverse 
Jacobian at solution. 


3rd entry: No. of times algorithm failed to converge (in parentheses). 


Algorithm 

n = 2 
Problems 

n = 5 
Problems 

n = 10 
Problems 

n = 15 
Problems 

Algl 

3 

3 

3 

3 

(Pi = nx lO' 3 ) 

3 

2 

0 

2 


(1) 

(0) 

(1) 

(1) 

Algl 

3 

4 

3 

3 

(p t = 2n x 10~ 3 ) 

3 

2 

2 

1 


(1) 

(0) 

(1) 

(0) 

Alg I-S 

3 

2 

2 

4 


3 

1 

0 

0 


(1) 

(2) 

(1) 

(1) 

Broyden 2 

2 

2 

1 

0 


1 

1 

2 

2 


(1) 

(1) 

(1) 

(2) 
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comparison numbers are based on the best performance in each problem of the 
algorithms in the group rather than of all algorithms tested. 

Table 4.5 reveals that either case of Algorithm II was generally more effective 
than any of the other algorithms in the group. Besides yielding the fewest function 
evaluations in general, Algorithm II converged in all of the 20 random problems 
under test whereas the other algorithms in the group each failed to converge 
once (in Problem #152). In addition, the convergence of Algorithm EI-S was very 
poor in one other problem (Problem #102) where it required 94 evaluations to 
reach a solution. The combination of Broyden 1 and Algorithm H-S did not per- 
form any better on the average than the individual algorithms comprising it. In 
the aforementioned problem (#102) when Algorithm II-S had poor convergence, 
this combination did much better than Algorithm II-S but still much worse than 
Broyden 1. It would seem that Broyden 1 is to be preferred over this combina- 
tion algorithm. 

Table 4.6 reveals that Algorithm I with p t = 2n x 10“ 3 was generally more 
effective than any of the other algorithms in the group. Although it failed to 
converge twice (in Problems #23 and #102), none of the other algorithms in the 
group converged in these two problems. In addition, Algorithm I-S and Broyden 2 
failed to converge three additional times. Algorithm I with p x = n x 10“ 3 was 
not quite as effective as with p x = 2n x 10" 3 although it still performed better 
on the average than either Algorithm I-S or Broyden 2. Both cases of Algorithm I 
generally yielded smaller errors in H kf than did Algorithm I-S. In addition 
Algorithm I-S often had some strikingly large errors in H kf even when it con- 
verged (see tabulation for Problems #25, #104, #153, and #154). Broyden 2, 


ft 
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when it converged, yielded a fairly accurate H kf , almost as good on the average 
as did Algorithm I. 

It is of interest to examine the cases where Algorithm I-S or II-S converged to 
a region near a solution but failed to converge to a solution (denoted by "Osc" 
in Tables 4.1 through 4.4), and also those cases where either of these algorithms 
required an excessively large number of evaluations to reach a solution. In 
problem #51, Algorithm I-S reached the vicinity of a solution within about 25 
function evaluations but oscillated erratically thereafter. When terminated, the 
error in (H k - J" 1 (x) ) for either of the known solutions in the vicinity was an 
astounding 4430. This number also approximated the magnitude of the largest 
component of H k since the components of J" 1 (x) were less than one in all of 
the problems. At the same time, the determinant of H k measured only -1.3 x 10~ 9 , 
indicating a poorly conditioned, near singular matrix. The relative independence 
of the vectors Ax 1 comprising the matrix AX k , as measured by program COND 
described in Appendix 2, was only 6.0 x 10“ 13 . Thus the poor conditioning of this 
set of vectors was apparently the major factor in causing an erroneous, near- 
singular matrix H k , and in preventing convergence. 

In problem #102, Algorithm II-S required 94 function evaluations to reach the 
expected solution, with the error in H kf being quite large (30.9). Again H kf was 
near singular (det H kf = 7.8 x 10” 2 °) and the conditioning of the vectors Ax 1 
comprising A X kf was extremely poor (6.0 x 10" 3 2 ). Periodic measurements 
during this run of the error in H k , the determinant of H k , and the conditioning 
of AX k showed the same pattern. This same pattern was also in evidence in 
Problem #55 where Algorithm I-S exhibited continual oscillatory behavior in the 
vicinity of a solution, and in Problem #152 where Algorithms I-S and II-S both 
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exhibited erratic oscillation in the vicinity of a solution and failure to converge. 
Thus the inability of Algorithms I-S and II-S to prevent near singular matrices 
H k appears to be a factor in delaying or preventing convergence at times. 

Algorithm I exhibited a related type of behavior in two cases, Problems #152 
and #153. In Problem #152, Algorithm I with p t = n x 10~ 3 failed to converge 
after 91 function evaluations although it was close to a solution. However its 
behavior near the solution was not at all erratic and the iteration would have 
undoubtedly converged to a solution if the run had not been terminated. There 
had been no change in H k and D k for about 25 iterations preceding termination 
since A f k at each of these steps was insufficiently independent of any set of 
(m k - 1) of the previous vectors A f 1 retained in D k . The value of II f k || 2 was 
geometrically decreasing with a ratio of between .91 and .94 over these 25 itera- 
tions indicating (slow) linear convergence. The unchanging matrix H k was near 
singular and the conditioning of the vectors Ax 1 corresponding to the aforemen- 
tioned A f * was very poor. This situation is not altogether unexpected since only 
the independence of the vectors A f 1 retained in D k is assured in Algorithm I. 

As noted in Chapter III, the independence of the corresponding vectors Ax 1 would 
be assured only if these vectors had been formed from iterates sufficiently close 
to the solution. This apparently was not the case in this particular problem. 

In Problem #153 with p x = 2n x 10“ 3 , a similar situation occurred although here 
the iteration was allowed to continue to a solution (in 106 evaluations) because 
the linear rate of convergence was better (about 0.76). This rate of convergence 
existed over the 50 iterations preceding convergence, during which time H k and 
D k (= AF*) was retained unchanged for the same reason as before. Again the 
matrix H k was near singular and the conditioning of the vectors Ax 1 , corresponding 
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to the vectors Af 1 retained in D k , was very poor. Despite these two cases of 
very slow convergence. Algorithm I on the average showed better performance 
than Secant Algorithm I-S, both in the number of evaluations required for con- 
vergence and in the accuracy of H kf . 

A number of runs were made with Algorithms I and II in which the value of p x 
was either much larger or much smaller than the "nominal" values of n x 10“ 3 
and 2n x 10“ 3 . Sometimes these runs produced better results than for the 
"nominal" values of p x . More often, the results were not quite as good. In 
general, too small a value of p l simply resulted in Algorithms I and n degen- 
erating into Algorithms I-S and II-S respectively. Too large a value of p x often 
delayed or prevented convergence. No thorough attempt was made to optimize 
the selection of p x . The rule of thumb developed for selecting "nominal" values 
of /?! proved adequate to demonstrate improved performance of Algorithm I 
over I-S and of Algorithm II over II-S in this series of problems. While the 
results from this one series of problems are not conclusive, these results do 
support the claim that Algorithms I and II are improvements over Secant Algo- 
rithms I-S and II-S respectively. 

As noted earlier, Algorithm n showed better performance than did Algorithm I. 
This was true in almost every problem, in terms of ability to converge, the 
number of function evaluations required, and the accuracy of H kf . As discussed 
earlier, this is attributed in large measure to the inherent ability of Algorithm II 
to avoid singular or near singular matrices H k , which is lacking in Algorithm I. 
This characteristic is directly associated in Algorithm II with the selection and 
retention of "sufficiently" independent vectors Ax 1 comprising C![. At the same 
time Algorithm II insures that H k is well defined (as does Algorithm I) through 
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the requirement that the associated vectors Af* be "sufficiently" independent 
(as determined by criterion (2.36)). 

Criterion (2.36) utilizes the parameter p 2 which was set at 0.1/Oj in this series 
of problems as previously discussed. Probably as a result of the much smaller 
value of p 2 compared to p x , the relevant criterion associated with p 2 was gen- 
erally not the determining criterion in terms of rejection of a trial vector a j c . 
That is, if criterion (2.35) was satisfied, then criterion (2.36) was usually (but 
not always) satisfied as well. Perhaps Algorithm II could function almost as 
well in the great majority of cases by utilizing only criterion (2.35) which assures 
good conditioning of the retained vectors Ax*. Possibly the good conditioning of 
the corresponding vectors Af* is not quite as important in general. It might be 
useful in future research to compare the performance of Algorithm II in a num- 
ber of cases with a simplified Algorithm II in which criterion (2.36) is dropped. 

In this series of problems, Algorithm II showed better performance than Broy- 
den 1 and Algorithm I better than Broyden 2. The ineffectiveness of Broyden 2 
has already been noted by Broyden [ 4) so further comparison is not warranted. 
Consider the performance of Algorithm II vs Broyden 1. Algorithm n showed up 
somewhat better in terms of the primary criterion, the number of function 
evaluations required for convergence. However, the calculations per iteration 
in Broyden 1 are much simpler than in Algorithm II (or II-S for that matter). 

Thus the criterion of function evaluations is not quite fair to Broyden 1 , since 
a small average increase in the number of function evaluations may be more 
than balanced by the decreased calculations at each step. About the only area 
where Algorithm II showed a decided edge over Broyden 1 is in the ability to 
reach a solution in this series of problems. Algorithm II converged in all of 
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the 20 problems under test whereas Broyden 1 failed once (in Problem #152). 

In view of the foregoing, further numerical experimentation in many types of 
problems would be required to establish the merit of Algorithm II relative to 
Broyden 1. The results of the present numerical work suggest that Algorithm II 
is competitive with Broyden 1 and may have an edge in the ability to reach a 
solution in many problems. 
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SECTION V 


CONCLUSIONS 

Based on the preceding developments, the following conclusions have been 
reached. 

1. Algorithms I and II are improvements over Secant Algorithms I-S and 
EI-S respectively, in terms of ability to converge to a solution x of f (x) 
= 0, the number of function evaluations required, and in the closeness 
of the final matrix H kf to J -1 (x). 

2. Secant Algorithms I-S and n-S have a tendency towards poor conver- 
gence, especially as the order of the system increases. This is related 
to their inability to insure that H k exists and is non-singular. Algo- 
rithm II-S is judged to be the better of the two. 

3. Algorithm n is considered to be quite effective and warrants further 
study. Algorithm I, while more effective than Secant Algorithm I-S, 
does not measure up to Algorithm n in general performance. 

4. Algorithm I insures the existence of H k while Algorithm n insures 
both the existence and non-singularity of H k . These characteristics 
together with related factors are judged responsible for the better per- 
formance in general of Algorithm I compared to I-S, of Algorithm n 
compared to n-S, and of Algorithm II compared to I. 

5 . Algorithms I and II are only moderately more complex than Secant 
Algorithms I-S and II-S respectively. The added complexity consists 
of the determination at each step of whether the associated criteria are 


5-1 



satisfied. These criteria are relatively simple to apply. Algorithm II 
has two associated criteria, as opposed to one in Algorithm I, and thus 
requires somewhat more computations. 

6. If the criteria associated wife Algorithms I and II are initially satisfied 
at each step, then under certain additional assumptions these algorithms 
possess superlinear local convergence. If these criteria are not ini- 
tially satisfied at each step, these algorithms are still likely to possess 
linear convergence. 

7. Algorithm II appears to be competitive with Broyden’s. Method 1 in 
overall effectiveness. The simpler calculations per step in Broyden’s 
Method 1 are offset to some extent by the somewhat faster convergence 
of Algorithm II. In addition, Algorithm II may have an edge in fee 
ability to reach a solution in many problems. 

The following are a few suggestions for additional research in this area. 

1. Compare the performance of full step versions of Algorithm II, Algo- 
rithm II-S, and Broyden's Method 1 in a variety of problems including 
those of high order (n > 10). Experiment with other rules for selecting 
the parameters p x and p 2 in Algorithm II besides those used in the 
numerical part of this research. 

2. Repeat for one or more norm reducing versions of these three methods. 

3. Compare Jxc pC j .formance of Algorithm n in item 1 and/or item 2 with 
that of a simplified version in which the criterion associated with p 2 is 
dropped. 
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4. Determine the effectiveness of Algorithm n when used in conjunction 
with the Freudenstein-Roth technique for solving some very difficult 
problems. The final estimates x kf and H kf of one stage in the process 
could serve as the initial estimates x 1 and H 1 for the next stage. Com- 
pare the overall performance with that obtained when Newton's Method 
or Broyden's Method 1 is used in this process. 
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APPENDIX 1 


ALTERNATE ORTHOGONA LIZ ATION PROCEDURE 
FOR ALGORITHMS IB AND HB 

Algorithm IB requires the formation of b k , for possible use in Equations (2.4) 
and (2.12) , so as to satisfy 

(b k ) T Af k = 0, i = 1, 2 m k < n (Al.l) 

where A f k are the previous independent vectors retained in D k . Gram-Schmidt 
orthogonalization can be used to accomplish this. However, this can represent 
a considerable amount of work since each time Af k is used to replace some 
earlier vector in forming D k + 1 , the formation of bj k + 1 at the next step must 
start at or near the beginning of the Gram-Schmidt computational cycle. 

A simpler orthogonalization scheme can be developed by utilizing the vectors b. k , 
i = 1 to m k , which are available at every step in Algorithm IB. As indicated 
previously, the vectors bj* and Af , k are related by 

( b i k ) T f m = S im (since B k D k = I). (Al.2) 

Consider the following expression for forming b k . 

m k 

b k = Af k - £ c. b k (A1.3) 

i = l 

where c ; are constants to be determined. Since Equation (Al.l) must be satis- 
fied, one obtains 
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0 


IB |C 

= (b k ) T Afk = (Afk)T A f k - £ c. (b k )* Afk 

i - 1 


m k 

= (Af k ) T Af m k - £ c i S im (using A1.2) 

i = 1 


= (Af k )T Af k - c . 

' ' in m 


(A1.4) 


Thus 


c. = (Af k ) T Af k 


(A 1.5) 


and substituting in (A1.3) yields 

m k 

b.k = Af k - 2] /(Af k ) T Af A b. k . (A1.6) 

i = l ' ' 

Equation (Al.6) represents an alternate method for forming b k so as to be 
orthogonal to the previous vectors Af. k retained in D k . It can be expressed in 
matrix form as 

b k = Af k - (B k )* (D k ) T Af k . (A1.7) 

The relation between the vector b k obtained by this alternate orthogonalization 
procedure (Al.6) and that obtained via Gram-Schmidt orthogonalization can be 
seen in the following derivation. The two different vectors b k will be denoted by 


and 


Zq = bH as obtained from G-S orthogonalization (Al.8) 


z k = b. k as obtained from Equation (Al.6). 
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Equation (Al.6) shows that z k is a linear combination of Af k and the m k vectors 
b k . An examination of the operation of Algorithm IB reveals that each vector b k 
is a linear combination of some row of B k_1 and the vector b k-1 . The vector 
b k “ 1 was formed either by orthogonalization (as z k_1 ) or as the j-th row of B k ~ K 
In the latter case, the vector A f k “ 1 replaced some earlier vector A f 1 in forming 

r. k 
U 1 * 

Extrapolating this process back to the beginning reveals that z k is a linear 
combination of Af k , some q earlier vectors Af j *, Af J 2 1 . . ., Af )q that had been 
replaced by q of the vectors A f £ in D k , and the remaining (m k - q) vectors Af k 
(which did not have to replace any earlier vectors). For convenience in notation, 
we can arbitrarily rearrange D k so that the latter set of (n^ - q) vectors occupy 
the first (m k - q) columns. Then the above described linear combination can be 
expressed as 

m k ~q q 

2 1 = Af k + 2Z s i Af k + £] t. Af jl (A1.9) 

i=l i=l 

The vector z k as obtained by Gram-Schmidt orthogonalization is that component 
of Af k which is orthogonal to the m k vectors Af k in D k . Expressed another 
way, z k equals A f k minus the orthogonal projection of Af k upon this subspace. 
The projection operator for this subspace (see e.g., [12]) is 

p d = D k ((D k )T D k )-^ (D k )T. (A1.10) 

Then z£ can be expressed as 

z k = Af k _ p c Af k = (i - p d) A _f k . (Al.ll) 
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Since z£ is also orthogonal to this subspace, 


P D z k = 0 (A1.12) 

and 

z k = z k - P D z k = (I - P D ) z k . (A1.13) 

From Equations (Al.ll) and (A1.13) 

(z G k - z k ) = (I - P D ) (Af k - z k ) 


or 


m k -q 




— 7 k 


Z?) = - (! - p D ) E s i Af i k + E Afil 


i = l 


i = 1 


(using (A1.9)). (Al.14) 

The first vector sum in (Al.14) lies in the subspace for which P D is the projection 
operator. Hence (Al.14) reduces to 

( Zq - Z k ) = - (I - P D ) £ Af jl . (A1.15) 

i = l 

The form of (A1.15) shows that (z k - z k ) = 0 (or z^ = z k ) if the q vectors A f M , 

. . ., A f jq all lie in the subspace formed by the vectors Af k comprising D k . This 
condition is "almost" satisfied in the following sense. The vector A f 1 1 was re- 
placed by some later vector Af kl because Af kl was "almost" a linear combina- 
tion of the vectors comprising D kl . Expressed another way, Af J1 was "almost" 
a linear combination of the vectors retained to form D kl+1 . Similarly Af 12 was 
replaced by a later vector Af k2 because Af J 2 wa s nearly a linear combination 
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of the vectors retained to form Dj 2 + 1 , etc. Then the vectors Af j 1 , . . ., Af j « 
are "almost" in the subspace of the vectors comprising Df so that (z£ - Zj k ) ~ o. 
The smaller p x is in the criterion of (2.5), the closer is this approximation. 

At k = p when n - 1 independent vectors AfP have been retained in Df, zf (if 
it is non-zero) will be the same as z<P (to within a scalar multiple which is 
irrelevant) , regardless of whether or not the replaced vectors Af are in the 
subspace spanned by the columns of D f. This subspace would be of dimension 
n - 1, and since both zf and zg are orthogonal to this subspace, they both must 
lie along the same line in n space. More formally, referring to (A1.15), 1 t £ Af Ji 
lies in n space and can be expressed as a linear combination of A f p and the n - 1 
vectors comprising . But (I -P D ) operating on this linear combination yields 
zero everywhere except for the component along Af p , or 

(zp - Z P) = -(I-P D ) M AfP 

= ~ M z g (using (Al. 11) ) 

or 

zf = (I+aOzJ. (A1.16) 

If no earlier vectors Af jl have been replaced up to the k-th instant, then (Al.15) 
implies that z k = Zl k . This is always the case in (Secant) Algorithm IB-S. Hence 
the use of the alternate orthogonalization procedure of (Al.6) in Algorithm IB-S 
yields identical results to those obtained by Gram-Schmidt orthogonalization. 

The same orthogonalization procedure just described can also be used in Algo- 
rithm IIB to form the trial vector a k satisfying 
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(a j k ) T Ax. k = 0, i = 1, 2 “He! m k < n (A1.17) 

where &x. k are the previous independent vectors retained in C k . Following the 
same procedure as before yields the orthogonalization formula 

mjc 

a. k = Ax k - ^ (Ax k ) T AxK af (Al.18) 

i = 1 

or in matrix form, 

a k = Ax k - (A k ) T (C k ) T Ax k . (Al.19) 

The same comments generally apply as before, with regard to the relation of a k 
as formed by (Al.18) to that obtained by Gram-Schmidt orthogonalization. Again 
at k = p, when n - 1 independent vectors AxP have been retained in Cp, the two 
procedures yield the same results (to within a scalar multiple) . When used in 
(Secant) Algorithm IIB-S, the orthogonalization procedure of (Al.18) yields 
identical results to that of Gram-Schmidt orthogonalization, for the same reasons 
as previously discussed. 
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APPENDIX 2 


PROGRAMMING 

All of the numerical work was performed on an APL/360 computing system. 

This system was selected for the numerical part of this research because of 
the simplicity of its instructions (especially with regard to vector and matrix 
operations) and because of the convenience of remote terminal operation. While 
the size of the workspace available to the user is somewhat limited in this sys- 
tem , it was adequate for the range of problems considered in this study (up to 
and including 15th order systems). The computer operations are performed with 
a precision of about 16 decimal digits. 

The programs listed at the end of this appendix were utilized for all of the 
numerical work. The programs were devised by the author except for INV (for 
calculating the inverse of a matrix) and DET (for calculating the determinant) , 
which were borrowed from an APL/360 Public Library. For any given method, 
the programming is generally not the most efficient in terms of computer opera- 
tions and storage requirements. Rather the programs were composed in any 
way that seemed convenient to the author . This in no way affects the validity of 
the results for any method nor the conclusions drawn therefrom. The programs 
can always be rewritten to minimize computer operations and storage require- 
ments for a particular method. 

As described in Chapter IV, the test problems considered are 

f (x) = E - (A sin x + B cos x) = 0 (A2.1) 

where the notation si/ x (or cos x) refers to the n-vector whose i-th component 
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is sin x i (or cos x £ ), where x k is the i-th component of x. The components of 
the n x n matrices A and B (denoted as AN and BN in programs F and FNR) are 
random integer sin, the range -100 to +100. The n-vector E (denoted as EN in 
programs F and FNR) is obtained from 

E = A sin x + B cos x (A2.2) 

where x (denoted as XN in program FNR) is an n-vector each of whose compo- 
nents is a random number between -rr and +rr. Thus x is automatically one 
solution of (A2.1). The starting estimate x 1 for a particular problem is formed 
as 

x 1 = x + O.lS (A2.3) 

where S (denoted as DN in program FNR) is also an n-vector whose components 
are each a random number between -n and +n. The problems were generated 
as follows. For each value of n considered, program RM was used to generate 
two 5 x n 2 matrices each of whose components is a random integer between 
-100 and +100. These matrices are denoted in program FNR as A2, B2, A5, B5, 
A10, B10, A15, B15 corresponding to n = 2, 5, 10 and 15 respectively. Similarly, 
program RV was used to generate two 5 x n matrices for each value of n, each of 
whose components is a random number between -n and +n. These matrices are 
denoted in program FNR as X2, D2, X5, D5, X10, D10, X15 and D15 corresponding 
to the aforementioned values of n. This then provided the basis for five individual 
problems for each value of n, or a total of 20 problems. A particular problem 
was set up by running program FNR after specifying the problem (function) num- 
ber via the variable FCT. Thus for example, setting FCT = 102 and running 
FNR established the second problem in the n = 10 series; the matrices AN and 
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BN would be formed from the second rows of A10 and BIO respectively while 
the vectors XN and DN would be formed from the second rows of X10 and DIO 
respectively. The initial value x 1 (denoted as XI in FNR) would also be estab- 
lished as per (A2.3). 

Once a problem was set up, an approximation H 1 to the inverse Jacobian at x 1 
(denoted as JX1) was formed via program JJNV , which formed a discrete approxi- 
mation to the Jacobian via program F and calculated the inverse via INV. The 
discretization parameter h was specified as 10“ 4 (via DEL). The matrix denoted 
as E in JINV is the identity matrix of order n. Then a particular run was speci- 
fied by assigning component values to the vector variable SEL, which selected 
among other things the particular algorithm to be used, the quasi-Newton step 
(full step was used throughout, i.e., a k =1), the maximum number of iterations 
permitted, the convergence criterion for || f k tl 2 (set at 10“ 6 throughout), and the 
value of Pj for use in Algorithm I or Algorithm n. The parameter p 2 in Algo- 
rithm II was arbitrarily fixed at 0.1 p x . 

The program BEGIN started a particular run. This program printed out several 
items for run identification purposes including the algorithm used, the function 
(problem) number, and the selected value of p x (when Algorithm I or Algorithm II 
was utilized) . This program also incorporated the programs INITIAL and 
ITERATE. The program INITIAL established initial values for several variables 
and ITERATE performed the actual iterations. Included in ITERATE are pro- 
grams AMEND and HNEW. The program AMEND formed Ax k , x k + 1 , f k + 1 , and 
A f k in the usual way, for subsequent use. The program HNEW then formed H k+ 1 
according to the particular algorithm that had been selected. For example in the 
discrete Newton's Method (ALG = 9), Il k+1 was formed as an approximation to 


A2-3 



j-i( X k + i) using program JINV. For the other methods, HNEW selected the 
appropriate program for calculating H k + 1 . After calculation of H k + 1 , the execu- 
tion returned to ITERATE where the iterate index k was incremented by one. 
After a check for convergence of || f k || 2 and for determination of whether k ex- 
ceeded the allowable limit, the process was repeated. 

Besides a discrete Newton's Method, die other methods used on each problem 
were Broyden's Methods 1 and 2 (ALG = 7 and ALG = 8 respectively), Secant 
algorithms I-S and II-S (ALG = 5 and 6 respectively), a combination of Broyden's 
Method 1 and Algorithm II-S (ALG = 10), Algorithm I (ALG = 1), and Algorithm II 
(ALG = 2). There was also provision for using Algorithms I and II with Gram- 
Schmidt orthogonalization (ALG = 3 and 4 respectively) but this was only used 
on a few problems since the computations were more extensive and there was 
generally little if any improvement over the simpler alternate orthogonalization 
scheme. The program names for forming H k + 1 in accordance with the afore- 
mentioned methods are BROYDEN, SECANT, BRIALGHS, ALGI, and ALGII. 

Program SECANT includes Algorithms I-S and II-S. When Algorithm I-S (II-S) 
is used, Algorithm IB-S (IIB-S) is in effect until k - n + 1 at which time a switch 
is made to Algorithm IA-S (IIA-S). Similarly, in program ALGI (or ALGII), Algo- 
rithm IB (IIB) is in effect until m k = n at which time a switch is made to Algo- 
rithm LA (UA). The operation of programs ALGI and ALGII will be described in 
the following paragraphs , with the program variables identified in terms of the 
nomenclature used elsewhere in this report. The description should also suffice 
to clarify the operation of the similar but simpler program SECANT. 

When Algorithm IB (IIB) is in effect, the vector Z, representing b k (aj 1 ), is 
formed first via an orthogonalization procedure. If the appropriate criteria are 
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satisfied, as implemented by program BETA, this value of Z is used to form 
ZMO, ZM, and HN, representing B}' 0 (A k 0 ), B k+1 (AJ'* 1 ), and H k + 1 respectively. 
If not, Z is formed as the j-th row of the previous ZM, j being the smallest in- 
teger as determined by program TEST such that the appropriate criteria are 
satisfied. This value of Z is then used to form the new matrices ZM and HN, 
i.e., B k + 1 (A} c+1 ) and H k+1 If no value of j in the range of 1 to M, i.e., 1 to m k , 
satisfies the criteria, then the previous matrices ZM, DM, and HN are retained 
unchanged. In each case the updated matrix DM represents D} c + 1 . 

If and when m k = n, Algorithm LA (HA) takes over. Then Z is formed as the j-th 
row of the previous ZM, which now represents (AF k ) -1 ((AX k ) _1 ), provided the 
appropriate criteria are satisfied, with j determined as before. This value of Z 
is then used to update the matrices ZM and HN. If these criteria cannot be satis- 
fied by any value of j in the range 1 to n, then the previous matrices ZM, DM, 
and HN are retained unchanged. The matrix DM, representing AF k (AX k ), is not 
actually needed in Algorithm IA (HA) but is retained as an aid in subsequent 
analysis. The program PM is utilized in ALGI and ALGII to form PJ, which 
represents the permuting matrix P. of order m k . Also required is the variable 
EM representing I . 

The program COND is used to determine a condition number C A of a matrix A. 

If A is n x n, C A is defined as | A k I of (3.6) (for A = AX k ). If A is n x m, m < n, 
C A is given by 

C A = ~ Cl — , a i = i-th column of A. (A2.4) 

* HaHI, 

i= l 2 

In either case, C A measures the relative independence of the columns of A, for 
use in later analysis. 
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[6] XN*X2ZRES\] 

[7] DN*D2ZRESil 

re] *F 0 

[9] F50s-*-(FC3’>100)/F10o 

[10] AN* 5 5 pASZRESil 

[11] BN* 5 5 pF5[FFF;] 

[12] XN*XSZFESil 

[13] DN*D$ZRES\Z 

[14] -*-F0 

[15] FI 00 r-*( FCT> 150)/F150 

[16] AN* 10 10 pA 1 0[FF5 ; ] 

[17] BN* 10 10 pF10[FFS{] 

[18] XN*X10ZRES s ] 

[19] DN*DlOZBESil 

r 20 ] -*-F0 

[21] F150:i4F«- 15 15 p>!15[FFS}] 

[22] BN* 15 15 pB15[FFS{] 

[23] XN*X15ZFESil 

[24] DN*D1 5 ZRES ; ] 

[25] FQ:EN*(.AN+.*10XN) + (BN+.*20XN) 

[26] X1*XN+0.1*DN 
7 

7 F [0] 7 
7 Y*F X 

Tl] Y*EN-(.(.AN+.xloX) + (.BN+.x2oX)) 

7 
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V BEGIN CD] V 

V BEGIN 

Cl] 47<7-*557Cl] 

C 2 ] 575P«*557[2] 

C 3 ] FCT+SEL C31 
[4] 51«*557[4] 

C 5 ] DEL+SEL C5] 

C 6 ] P50«-557C6] 

C 7 ] LIM+SEL C7] 

C 8 ] 5055-*55Z[8] 

C 9] P5757+S57C9] 

CIO] OpO 

Cll] 51:-*51 + (2x47(7)-l 

C 1 2 ] 'ALGORITHM I\ ALT. ORTROG.' 

C 1 3 ] -*5 2 

C 1 4 ] 'ALGORITHM II, ALT. ORTROG' 

Cl 5] -*5 2 

C 1 6 ] 'ALGORITHM I ; (7-5 ORTROG.' 

Cl 7] -*52 

C 1 8 ] *4Z50FI75Af 77; 5-5 ORTROG.' 

C 1 9 ] -*-52 

C 2 0 ] 'ALGORITHM 7-5* 

C 2 1 ] -*52 

C 2 2 ] 'ALGORITHM II -S' 

C 2 3 ] -*52 

C 24] 'BROYDEN 1* 

C 2 5 ] -*52 

C 2 6 ] 'BROYDEN 2* 

C 2 7 ] -*52 

C 2 8 ] 'DISCRETE NEWTON ; 55774 75 * ;557 
C 2 9 ] -*52 

C 3 0 ] 'COMBINATION ALG 77-5 455 BROYDEN 1* 
C 3 1 ] 52 : -*•( STEP> 1 ) /53 
C 3 2 ] * FULL STEP' 

C 3 3 ] -*54 

C 3 4 ] 53 s * REDUCED STEP ; SCHEME ' -.STEP 
C 3 5 ] 54 : * FUNCTION NO. *;F^7;’; 5 75 ’ ;5 

C 3 6 ] -*(475 = 9)/57 

C 3 7 ] -*(51 > 1 ) /55 

C 3 8 ] *51 75 7755 ATI; 55774 75 * ;557 

C 3 9 ] -*56 

C 4 0 ] 55: '51 75 7* 

[41] 5B:-*(475>4)/57 

[42] 'PRO TS ' iRHO 

[43] 57 : Op 0 

[44] INITIAL 

[45] ITERATE 

V 
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V INITIAL cm 7 
7 INITIAL 
Cl] ’ NX-*-Xl 
C 2 ] FNX+F NX 
C3] MFNX++/FNX*FNX 
C 4 ] K+-M+-Q+L+-0 

C 5 ] DM+Np 0 

C 6 ] ZM+SDM 
[7] •+(H1 = 1 ) /IN 

C 8 ] H N+E 

[9] ■*■0 

CIO] INiHN+JX 1 
7 


7 ITERATE CP] V 
V ITER A TE 
C 1 1 ITER : * 

C 2 ] FX+FNX 
C 3 1 H+HN 
C 4 ] MAGF+MFNX 
C 5 ] K+K + 1 

C 6 ] 'Ki *;*}'{ NORM OF FXi ' \MAGF* 0.5 

C 7 ] +(PRINT<3 )/LT 

C 8 ] +((S\K)*0)/LT 

C 9 ] 'COND. NO. OF Hi ' \C0ND H 

CIO] 'DET OF Hi ' \DT 

Cll] +{ALG > 6 )/LT 

C 1 2 ] ' COND . NO. OF DMi ' \C0ND DM 

C 13 ] 'DET OF DMi ' iDT 
C 1 4 ] LTi+((MAGF*O.S)<CONV)/SOL 
CIS] -*-(.PRINT<l)/LTl 

C 1 6 ] 'MAX DIFF FROM EXPECTED SOL i ’jT /\X-XN 
C 1 7 ] ■>( PRINT <2 ) /LT1 

CIS] 'MAX DIFF BETWEEN H AND JINV AT SOLi * }C/ • (N*N)p(H-JXF) 
Cl«] LT1 t-+(K>LIM) /TERM 
C 20 ] P+-H+ . xFX 

C 21 1 AMEND 
C 2 2 ] HNEW 
C 23 ] -*-ITER 

C 2 4 1 TERM '.'TERMINATED i NO. OF TTE RAT IONS EXCEEDS LIMIT' 

C 25 1 -END 
C?R] SOLt OpO 
C 27 1 +(N>S)/LTS 
C 28 1 • SOLUTION ISi ' 

C 2 9 ] X 

C 3 0 ] LTSi'MAX DIFF FROM EXPECTED SOLt * 5 T / I X-XN 
c 3 1 ] 'MAX ABS ERROR IN APPR JINV AT SOLt ' ; [ / | (NxN)p ( H-JXF ) 
C32] +(PRINT<*)/END 
C 3 3 ] 'APPROX JINV AT SOLUTION 1 ' 

C 3 4 ] H 

C 3 5 ] ENDi 3 0 pO 
7 
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7 AMEND CD] 7 
7 AMEND 

Cl] ■+■ ( STEP> 1 ) /RED 

[2] DX+P 
C3] NX+X+DX 
C4] FNX+-F NX 
C 5 ] MFNX++ /FNXx FNX 

C 6 ] -*-MR 21 

C 7 ] RED: REDUCE 
C 8 ] MR2UDF+FNX-FX 
C9] +(PRINT<3)/ 0 

CIO] * MAG N OF DXi ' / \DX 
Cll] 'MAGN OF DF j * 5 T / I DF 
7 


7 RNEW CD] 7 
7 HNEW 

Cl] +(ALG*3)/NW1 
C 2 ] RN+JINV NX 
C3] +OUT 

C 4] NWlt+(ALG=10)/NWS 
C 5 ] +(ALG<7)/NW2 

C 6 ] BROYDEN 
C 7 ] +OUT 

C 8 ] NW2:-+{ALG<S)/NW3 

C 9 ] SECANT 
CIO] +OUT 

Cll] NW3t + ((ALG*2)v(Ar J G = t*))/NWt* 
C 1 2] ALGI 
C 1 3 ] -*-OUT 
C 1 4 ] NWHxALGTI 
C 1 5 ] OUT : OpO 
C 1 6 ] •*■0 

[ 17 ] NW5 iBRIALGITS 
7 


7 JINV CD] 7 

7 RT+JINV XT \YT iVT iDYT il \ZT iBT 
[ 1 ] YT+F XT 
C 2 ] VT+\I+0 

C 3 ] LOOP: ZT+XT+DELxEt ;J«-J+l] 

C4] DYT<-(F ZT)-YT 
C 5 ] VT+VT.DYT 
C 6 ] -*■(! <N ) /LOOP 

C 7 ] BT+(*DEL)x(N t N)pVT 
C 8 ] HT*-INV((*BT) 

7 
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7 SECANT CD] V 

7 SECANT iUiViZiZDViZMOtPJiDMO 
Cl 1 V+DX-R+.xDF 

[ 2 ] +(ALG=6)/S2 

[3] SI iDV+DF 

[4] +53 

[5] S2tDV+DX 
C6] S3i+(K>N)/SH 

[7] Z+DV-($ZM) + .x((nDM) + .xDV 

[ 8 ] ZDV++/Z*DV 

[9] ZMO--ZM-(.*ZDV)x(ZM+.xDV)* ,kZ 

[10] ZM+(K,N)p((NxX-l )pZbfO)„Z*ZBV 

ril] DM+6i(X t N)p((NxX-l)t)QDM) ,DV 
Cl 2] •♦•55 

[13] 54 t Z-+-ZML 1 ; ] 

[14] EM+(N t N)pl,NpO 

Cl 5] ZMO+ZM+(*+/Z*DV)x(EMtl il-ZM+ ,xDV)» .xZ 

[16] M+N 

[17] PJ+PM 1 

[18] ZM+-PJ + . x ZMO 

[19] DM0+DM+ (DV-DMC ;l])o .xEbff ;1] 

[2 0] DM+-DM0 + . x ( $PJ ) 

[21] 55 :-*•(>! £5 = 6 )/56 

[ 22 ] V+Z 

[23] -+S1 

[24] 56 : F«-($5) + . xZ 

[25] S7 iHN+H+(*+/VxDF)*Vo . xV 
7 


7 BR1ALGIIS [ 0 ] 7 
7 BRlALGIIS‘,ViV',ZiZDViZMO‘ t PJiDMO 

[1] V+DX-R+.xDF 

[ 2 ] DV+DX 

[3] -*•( X>N )/554 

[4] Z4-DV-(<$7.M)+.x(<nDM) + .xDV 

[5] ZDV++/ZXDV 

[ 6 ] ZMO+ZM-(.*ZDV)x(ZM+,xDV)* .xZ 

[7] ZM+(.X,N)p<(NxX-l)pZMO),Z*ZDV 

[8] 7W-<-$(tf,tf)p<(ff**-l)o$7W) t T>\ 

[9] ■+555 

[10] 554:Z«-ZW[1;] 

[ 11 ] EM+(N t N)pl t NpO 

[12] ZM0+ZM+(*+/Z*DV)x(EMn } l-ZM+.xDV'to ,xZ 

[13] M+N 

[14] PJo-PM 1 

[15] ZM+PJ + . x Zb ! 0 

[16] DMQ+DM+(.DV-DMt ; 1 "I ) • . xFM[ } i] 

[17] DM*-DM0+.x(ts,pj) 

[18] SB 5 : ZC*-0. Sx ( Z+DV ) 

[19] F-»-($5)+. *ZC 

[ 20 ] HN+R+(* + /VxDF)xr/* ,xV 
7 
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7 BROXDEN CD] 7 
V BROXDEN \U\V\DH 
Cl 3 U+DX-H+ . *DF 
[23 +(ALG=B)/BR2 

[3] BR1 iV+(M) + .*DX 

[4] +BR 

[5 3 BR2 s V+DF 

[63 BRiDH+(*+/VxDF)xU* ,*V 
[73 HN+H+DH 
7 


V ALGI [f"l3 7 

7 ALGIiUiViZiZDViZMOiPJiDMO 

[13 U+DX-R+.xDF 

[23 DV+DF 

[33 -*-(Wi/?)M12 

[43 +(ALG=3)/GR1 

[53 Z«-DV- ( '*) + .x(S>DM) + .xDV 

[63 +A 11 

[7 3 <7i?l:Gtf4M 

[8 3 4 1 1 : Dfltf-*-Z 5FZV1 DK 

[93 +(CHK&RH0)/A15 

[103 ZDV++(ZxDV 

[113 ZMO«-ZM-(*ZDP)x(ZM+.*Dn*.*Z 
[123 ZM«-((M+1 ),N)p((NxM)fiZMO),Z*ZDV 

[13 3 DM«-$((M+1 ) t N)p( (NxM)n*DM) t DV 

[143 M+N + 1 

[153 * NEXT M INCREASED TOt ' }M 

[163 EM+(M t M)pl t MpO 
[17 3 -*414 

[18 3 415: 'D0 C’ff4tf(7£’ IN NEXT M\ BETA = 1 

[193 412: TEST 

[203 -*• ( CHA NGE ) /4 1 3 

[213 *JV0 CHANGE IN NEXT ZM , 4DD DM* 

[223 *►0 

[23 3 413:Z«-ZM[«7{3 
[243 ' Z IS ZM£ ' iJ { * { 3 * 

[253 ZDV++ / ZxDV 

[26 3 ZM0+-Ztf+(* ZDV)x(EMC ;Jl -ZM+ .xDV ) • . xZ 

[273 PJ+PM J 

[283 ZM+PJ+.xZMO 

[293 DM0+DM+ (DV-DMt iJ3)*.xEMZ ;^3 

[3 03 PV J ^MO+.xitzPJ) 

[313 Al*iHN+H+(*ZDV)xUo .xZ 
7 
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7 ALGII [□] 7 

7 ALGII lUiViZiZDViZMOtP'TiDMO 

[1] U+DX-H+.xDF 

[ 2 ] DV+DX 

[3 +(M*N)/A22 

T 4] +{ALG**)/GR2 

[5] Z4-DV-<.$ZM) + .x(<SiDM)+.xDV 

C6] +A 21 

[7] GR2:GRAM 

[ 8 ] A21 : CHK1+Z BETA DV 

[9] CHK2+(($H)+.xZ) BETA DF 

[10] +(.CHKliZRHO)/A25 

[11] +(CHK2i0.1xRHO)/A2S 
T12] ZDV++/ZxDV 

T 1 3 ] ZMO+ZM- <.*ZDV)x( ZM+ . xp V ) o . x Z 
[14] ZM +- ( ( Af + 1 ) ,tf)p( (RxM)cZMO) % Z*ZT)V 
T 1 5 ] ZW«-$( (M+l ) t N)p({NxM)n<HDM) t DV 

[16] Af+M+1 

[17] 'NEXT M INCREASED TO: » \M 
ri8] EM+(M t M)pl ,MpO 

T 1 9 ] -*-4 24 

r 20] A25:'NO CHANGE IN NEXT M\ GAMMA = 'iCHKU' t BET Ax 
[21] 422: TEST 
T 2 2 ] ^(.CHANGE) /4 23 

r 2 3 ] • NO CHANGE IN NEXT H , ZM t AND DM » 

[24] •*•0 

r 25 ] 4 23:Z«-ZM[J;] 

T 26 ] *Z IS ZMl * ;tT; ' ;]• 

T 2 7 ] ZPV«-+/Zx0V 

[28] ZM0+ZM+ ( * ZDV )x ( EMt iJ]-ZM+ , xDV)o t xZ 

[29] P«7«-Ptf .7 

[30] ZM+PJ + . x ZMO 

[31] DMO+DM+(DV-DMt ;,7] )o.xFW[ ;«/•] 

[3 2] DM+DM 0 + . x ( <5Pt7 ) 

[33] 424:K«-($P) + .xZ 

[34] +(PRINT<?)/A 33 

[3 5] » I-ZMxDM s * ;[/ | (MxM'>p(EM-ZM+ .xDM) 

[36] 433:ffF+P+(*+/Fx0F)x7/o.x7 
7 


• :CPF? 


# 
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V TEST CD] 7 
7 TEST 

ri] j-o 

r 2 D TliJ*J+l 
m *(J>M)/T3 
T4] TJ+ZMLJil BETA DV 
T 5 3 *(TJ>RH0)/T2 

C 6 3 +T 1 

m T2t*( (.ALG=l)v(ALG = 3))/Ttl 
f 8 3 VA*(.<HH)+.xZMZJil 
C 9 3 TJA*VA BETA DF 
[10] *(TJA>0.1xRHO)/T 4 

ril] »2V[' lJ\' ;] O.K.i TJAZ' iJl' ;]= ' iTJA 
[ 12 ] 

[13] T3 '.CHANGE* 0 
ri4] -*0 

T15] !T4: CHANGE*! 

7 


7 *£3V1 [[]] 7 

7 5<N-K1 BETA V2\MV\\Mv2\M12 

ri] tfl2«-+/VlKt r 2 

[ 2 ] WVl«-( + m*2)*0. 5 

r3] MV 2 *(+/V 2 * 2 )* 0 .S 
T4] SC*(*MVlxMV2)* \M12 
7 


7 PM [D] V 
7 A*PM J',VJ',PCJiZ.T 

ri] VJ*(Mpl )-EMZ iJ] 

F2] PCJ*VJ/Z 1] FA/ 
r3] ZJ-<-((Wx/./-l)pPCJ),£'W[ ;^] 

r4] A*(M,M)pZJ 
7 


7 FFylA/ [0] 7 
7 GRAMil \ZIM iDVI \ZT \MZI 

ri] i*o 

r 2 ] ZIM*Np 0 

m REP'. *( TaM) / LAST 

[4] J-kT+1 

[5] PVT+PH+ .xFMZ }7] 

[6] ZJ^PVT-Z7A/+.x($Z7A/)+.xPl'T 

[7] MZI*(.+/ZI*ZI)* 0.5 

[8] ZTM*$(.I t N)p{{NxI-l)p(*ZiM) % ZI*MZT 
T9] -*-FFP 

[10] LAST:Z*DV-ZIM+.x($ZIM)+.*DV 
7 
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v cond crn v 

V Y+COND A ; RA i I ; VM ; V I ; VIM 
Cl] P4+p4Cl»] 

C2] +(RA<N)/CDO 
C 3 ] DT+DET A 

[4] +CD 3 

T 5 ] CDOtDT+( \DET($A) + .*A >*0.5 

C6] CDZtI+l-VM+1 

C7] CDl:I+I+l 

C 8 ] +(I>RA)/CD 2 

[9] VT+Al- t n 

riO] VIM+(+/VIxVI)* 0.5 

ril] VM<rVIM*VM 

Cl 2] +CD 1 

ri3] <702: J+lPr+PW 

V 


7 

7 INV I’D] 7 

RB+INV RA iRKiRSiRPiRI 

n 3 

•*■(. ( 2=ppRA )a = /1 t pRA 'p4 

C 2 ] 

» NO INVERSE'.' 

C 3 3 

-y~RB + 1 

C 4 ] 

RK+ L / pRA 

C 5 ] 

RS-RK 

C 6 ] 

RP+\RK 

C 7 ] 

RA+RAl ;( \RS) ,1] 

C 8 ] 

04C ;l+05]*-( \RS)sl 

C 9 ] 

ri+( iP4[iPtf;n)ir/iP4CtP*r;n 

Cio] 

ppci ,px]«-ppcpj,i] 

Cii] 

RA c 1 ,RI i iRS]+RA [PI, 1 ; tPS] 

Cl 2 ] 

•*( 1J?""30> |Pi4Cl;l])o2 

C13] 

fMCl;3<-JMn ;]*JMCl :1] 

C 1 4 ] 

RA+RA-((~(iRS)sl )xRA[ ; 1 ] ) • . *P4 C 1 ; ] 

C 1 5 ] 

RA+RA [1+RS | iRS ; ( 1+ i RS ) , 1 ] 

C16] 

0P«-/?PC1+0S| IPS] 

ri7] 

+(0<RK+RK-1 ) / 8 

C 18 ] 

7 

RB+RAl jPPxxPS] 


7 0£T TO] V 
7 C-*-DET Z\J \Q 
Cl] -*-(l = p ,Z)p0 # C«-,Z 

C 2 ] +L2x\( 2*ppZ)A = /pZ 

C 3 ] +0,pO+' ILLEGAL STRUCTURE ' 

C 4 ] £2:-*-0M (1+pZ)</^(ZC/ 1 {] sOhC+, 0 

r 5 ] z-«-<«r-i)$z 

re] L6:Z«-Z-ZC ; 1 ] ° . *ZC 1 ? ]*C-^ZCl ;1 1 
C 7 ] <7«-( “l*«r-l )xC*DET 11 + Z 

7 
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